Bruce Capobianco

Senior Director, Technology, Real World Evidence Strategy & Analytics

How duplicate data environments, redundant data subscriptions, and siloed data access undermine the return on investment in Real World Evidence (RWE) generation

Advances in technology, from the massive storage and computing options now available through cloud computing to improvements in algorithmic processing, have empowered organisations to consider secondary real world data (RWD) sources as viable options for evidence generation across a wide spectrum of use cases. Within the last few years, the exploration and practical use of secondary data has grown exponentially and is now fuelling a new wave of digital disruption.

Insights we never thought possible are now becoming achievable. The challenge of this new era of data use is that unprepared organisations spend massive amounts on secondary data for specific one-off use cases, without considering how those data might be harmonised across the organisation, which creates large-scale data silos throughout the enterprise. Data silos bring inherent inefficiency and create roadblocks to achieving the desired success.

Common Barriers

Cost - The increasing and sometimes duplicative spend on data subscriptions is taking its toll on margins. In most cases, the cost of data now exceeds the cost of the development labour needed to generate insights from it. Holding multiple instances of the same data can double or even triple that cost, shrinking profits and dramatically lengthening time to outcomes because the same data must go through the onboarding process multiple times.

Accessibility - Tracking down available data sets can be a laborious task when data is siloed across the organisation. The time spent trying to ascertain whether the data needed is already owned by the enterprise, and if so, how to obtain access to it, significantly reduces productivity. The more time a data scientist spends chasing down data, the less time they have to analyse it, create meaningful output, and meet deliverable deadlines.

Operational - From an operational perspective, redundant storage, backups, and computing power cause a significant increase in infrastructure spend. Duplicated time spent on compliance and privacy processes for redundant data sets also adds up.

Inefficient Processing - Without a view of the full breadth of available data, teams are limited in their ability to meet deliverables and draw insights that are as compelling as they could be. In addition, when teams gain access to siloed data without a proper data dictionary or catalogue, it becomes extremely difficult to make inferences and use the data effectively.

Exclusivity - Business units are becoming much more protective of their data sets, creating barriers to collaborative research. Each team wants to preserve its data set to tell its story exclusively and may not want others to gain any insights from it.

As these barriers surface, inefficiencies become more widespread and the need to shift direction and undertake consolidation planning grows. To avoid this long and arduous path, teams across the organisation can be proactive and think before they leap by developing an overarching data strategy framework and a shared delivery model built on a platform.

Collaborative Solutions

Data Strategy - Driven by executive leadership and developed with data science and operational participants, a high-level data strategy must be designed. This strategy needs to start upstream: identify potential use cases across the organisation and align on the most effective data sets to support the analyses for those use cases. Once the data has been identified, it is imperative to ensure it has the appropriate depth and breadth; protocol and data density reports should be requested to confirm the targeted data sets are comprehensive. Once a data set has been vetted, proper infrastructure design should address operational questions such as how the data is to be ingested, linked, normalised, distributed, and maintained, ensuring the most efficient use of the data. The more standardised the data is, the more flexibly it can be transformed to serve many use cases.
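The upstream vetting step described above, matching candidate data sets against use-case requirements and a coverage threshold, can be captured in a lightweight, machine-readable form. The sketch below is illustrative only: the data-set names, use-case names, variables, and 80% coverage threshold are hypothetical assumptions, not part of any specific product or standard.

```python
from dataclasses import dataclass, field

@dataclass
class DataSet:
    name: str
    therapeutic_areas: set
    patient_count: int
    variable_coverage: dict  # variable name -> fraction of records populated

@dataclass
class UseCase:
    name: str
    therapeutic_area: str
    required_variables: list
    min_coverage: float = 0.8  # illustrative threshold for "comprehensive"

def vet_data_set(use_case: UseCase, data_set: DataSet) -> list:
    """Return a list of gaps; an empty list means the data set passes vetting."""
    gaps = []
    if use_case.therapeutic_area not in data_set.therapeutic_areas:
        gaps.append(f"no coverage of {use_case.therapeutic_area}")
    for var in use_case.required_variables:
        coverage = data_set.variable_coverage.get(var, 0.0)
        if coverage < use_case.min_coverage:
            gaps.append(f"{var} populated in only {coverage:.0%} of records")
    return gaps

# Hypothetical data density report for a claims extract
claims = DataSet(
    name="claims_extract",
    therapeutic_areas={"oncology", "cardiology"},
    patient_count=2_500_000,
    variable_coverage={"diagnosis_code": 0.99, "lab_result": 0.35},
)
hta_support = UseCase(
    name="HTA evidence package",
    therapeutic_area="oncology",
    required_variables=["diagnosis_code", "lab_result"],
)

print(vet_data_set(hta_support, claims))
# -> ['lab_result populated in only 35% of records']
```

Encoding the vetting criteria this way makes the trade-offs visible before a subscription is purchased, rather than after the data has already been onboarded.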

Centralised Platform - Having a centralised platform is critical to success. The platform must be fit for purpose and proficient in serving the diverse set of use cases derived from the data strategy. Multi-use capability is critical to capitalise on the investment in both the data and the platform. Being data agnostic, with the ability to ingest and normalise various data sources, means the organisation is not bound to any one particular data set and allows for the most expansive use of the platform and the most targeted analyses. Thanks to advances in technology, the ingestion and linking engine within the platform can be automated through machine learning and rules-based processing for data parsing and quality. The resultant common data model could be aligned to OMOP standards; however, depending on the rigidity of the design, a more flexible model may be better suited to the many tasks at hand in a shared data model approach. From a security and awareness perspective, role-based access and broadcast services must be in place to ensure the organisation has visibility into the data. Finally, and certainly one of the most critical areas, is the need to secure the data in a HIPAA- and GDPR-compliant platform that also boasts SOC 2 certification, ensuring the appropriate privacy safeguards are in place to protect your prized data assets.
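As one illustration of the rules-based processing and role-based access described above, the sketch below maps raw source records into a simplified common-data-model row and gates what each role may see. Every field name, rule, and role here is a hypothetical assumption; a production mapping to a standard such as OMOP involves controlled vocabularies and far richer logic than this.

```python
# Hypothetical rules-based normalisation into a simplified common data model,
# followed by a role-based access gate. All names are illustrative only.

CDM_RULES = {
    # common-data-model field -> source field names to try, in priority order
    "person_id": ["patient_id", "member_id"],
    "gender": ["sex", "gender_code"],
    "birth_year": ["yob", "birth_year"],
}

ROLE_FIELDS = {
    # role -> common-data-model fields that role is permitted to see
    "data_scientist": {"person_id", "gender", "birth_year"},
    "analyst": {"gender", "birth_year"},
}

def normalise(source_record: dict) -> dict:
    """Apply the first matching rule for each common-data-model field."""
    row = {}
    for cdm_field, candidates in CDM_RULES.items():
        for source_field in candidates:
            if source_field in source_record:
                row[cdm_field] = source_record[source_field]
                break
    return row

def release(row: dict, role: str) -> dict:
    """Return only the fields the caller's role is permitted to see."""
    allowed = ROLE_FIELDS.get(role, set())
    return {k: v for k, v in row.items() if k in allowed}

raw = {"member_id": "A-102", "sex": "F", "yob": 1984}
row = normalise(raw)
print(release(row, "analyst"))
# -> {'gender': 'F', 'birth_year': 1984}
```

Because every source feed passes through the same rule table, a new data set only requires new rule entries rather than a new pipeline, which is what makes the platform data agnostic.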

Achieving success is dependent on breaking down data silos and creating a more collaborative ecosystem. Consider taking the time up front to craft a detailed data strategy along with a shared data approach across the organisation. By getting everyone energised and aligned around the power of sharing assets, you can maximise the opportunities to enhance your business with Real World Data.

Learn more

To learn more or to discuss your real world data needs, contact our Real World Evidence team.

About the author

Bruce Capobianco has over 25 years’ experience in the architecture, development and implementation of complex big data solutions. He leads a team to develop, enhance, and maintain RWE technology solutions for ICON clients. He has a proven track record of identifying and implementing secure, usable and enduring technologies that augment business processes and optimise productivity. At Syneos he led a global team of architects, developers, PMs and SQA staff in the development of a HIPAA-compliant trial patient recruitment system, and established and drove disruptive technology trends for competitive advantage.