Why Radical Transparency is the Future of Healthcare Analytics
In the world of data analytics, we frequently encounter a common pitfall: treating transparency and ease of use as an either/or choice. Unfortunately, delivering one at the expense of the other serves healthcare stakeholders poorly.
We often see “transparency” used as an excuse to avoid essential data curation and cleaning work; “transparent” becomes a justification to simply not validate data and pass on “dirty” or “raw” data to the user. On the flip side, “ease of use” is frequently used to justify turning the data sourcing and cleaning process into a black box that protects proprietary information. Both approaches ultimately limit the data’s potential use cases and reduce its utility for effective healthcare analytics.
Breaking the Proprietary Barrier
The tension between data transparency and the protection of proprietary data and intellectual property (IP) is central to this debate. When IP protection limits how data can be used, its value diminishes. Real-world evidence (RWE) generated from data that isn’t transparent is evidence that can’t be trusted; the FDA and other regulators are demanding greater evidence of data provenance to ensure that the chain of custody over the underlying data is clear and that specific data points can be validated. Likewise, the data curation and cleaning steps that have traditionally been labeled proprietary must be made similarly transparent, so that consumers of the output analytics can judge for themselves whether the data was transformed in a manner that preserved its underlying integrity.
We firmly believe that the industry must pivot toward radical transparency. The core goal must be to maximize usability and value while achieving more transparency.
Tools for True Transparency
To implement radical transparency, we must move beyond dumping raw data and letting analysts sort it out. True transparency requires specific tools that give the user full visibility into the data’s quality and lineage without sacrificing usability. Veritas has implemented an approach that we believe is consistent with FAIR principles and delivers data of high utility to users:
- Complete Sourcing and Records: We commit to delivering only the records that meet minimum standards for completeness and quality, and to providing complete sourcing information for each data point. Furthermore, we incorporate references to necessary standards, such as the FDA Real-World Evidence (RWE) guidance, emphasizing robust data provenance.
- Confidence Scoring: Users shouldn’t need to build complex algorithms just to gauge the trustworthiness of a record. We create confidence scores for every record based on the source(s) from which it was derived, and how many of those sources agreed on each data value, to illustrate the robustness behind each record.
- Handling Duplicates: Where multiple sources are consistent, we consolidate the data into a single record and cite all underlying sources. Where there are inconsistencies, instead of arbitrarily removing outlier data and hiding it from analysts, we report that data as a separate record with a lower confidence score. Further, we include a “likely duplicate” indicator for these outlier records, and use pointers to direct users to the better record for that data point. These pointers make further deduplication straightforward for the user without removing the actual duplicate records from the source material.
- Open Discussion of Impact: Transparency means being honest about the data’s inherent quality challenges. We openly share the impact of the outlier data we provide, detailing how many of our records are considered low confidence or duplicates, and guiding users on whether and how to incorporate this data based on their use case.
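To make the consolidation, scoring, and duplicate-pointer ideas above concrete, here is a minimal Python sketch. It is an illustration only and not a description of the Veritas implementation: the `Record` fields, the `consolidate` and `summarize` functions, the scoring rule (share of sources that agree on a value), the 0.5 low-confidence threshold, and the sample source names are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    # All field names are hypothetical, chosen for this sketch.
    record_id: str
    entity_key: str
    value: str
    sources: list[str]            # every source that reported this value
    confidence: float             # assumed rule: share of sources that agree
    likely_duplicate: bool = False
    better_record_id: Optional[str] = None  # pointer to the preferred record


def consolidate(entity_key: str, reports: list[tuple[str, str]]) -> list[Record]:
    """Turn (source, value) reports for one data point into scored records."""
    if not reports:
        return []

    # Group sources by the value they reported.
    sources_by_value: dict[str, list[str]] = defaultdict(list)
    for source, value in reports:
        sources_by_value[value].append(source)

    total = len(reports)
    # Most-supported value first; ties are broken alphabetically by value.
    ranked = sorted(sources_by_value.items(), key=lambda kv: (-len(kv[1]), kv[0]))

    records: list[Record] = []
    for i, (value, sources) in enumerate(ranked):
        records.append(Record(
            record_id=f"{entity_key}-{i}",
            entity_key=entity_key,
            value=value,
            sources=sorted(sources),
            confidence=round(len(sources) / total, 2),
            # Outlier values are kept, flagged, and pointed at the best record.
            likely_duplicate=i > 0,
            better_record_id=records[0].record_id if i > 0 else None,
        ))
    return records


def summarize(records: list[Record], low_confidence_threshold: float = 0.5) -> dict[str, int]:
    """Count flagged records so their impact can be reported openly."""
    return {
        "total": len(records),
        "likely_duplicates": sum(r.likely_duplicate for r in records),
        "low_confidence": sum(r.confidence < low_confidence_threshold for r in records),
    }


# Hypothetical example: three sources report a phone number, two of them agree.
records = consolidate("provider-123-phone", [
    ("source_a", "555-0100"),
    ("source_b", "555-0100"),
    ("source_c", "555-0199"),
])
for record in records:
    print(record)
print(summarize(records))
```

In this sketch, the two agreeing sources collapse into one record that cites both, while the conflicting value is kept as its own lower-confidence record flagged as a likely duplicate. A user who wants a deduplicated view can filter on `likely_duplicate` or follow `better_record_id`, and the outlier stays available for anyone who needs it.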

Accessing Transparency Behind the Data
Building transparency into your data process is important, but users also need a way to access that transparency information. At Veritas, we provide access through our dedicated web portal. We view our system not just as a data repository, but as an Index: a highly structured and transparent map to every data point. By delivering radical transparency, we replace the limiting black box of “proprietary data curation” with an open system that elevates trust and dramatically expands the potential utility and application of analytics for every client.
Request More Information
Speak to a Veritas expert to learn how subscribing to our data can make your organization’s operations and analytics more effective.
