Client Success

US-based pharmaceutical and clinical research leader saves time and costs developing new medicines with AWS-powered data insights

For nearly 35 years, this US-based pharmaceutical and clinical research leader has been developing life-transforming medicines for people with serious ailments.

The Challenge

Clinical trial data is traditionally scattered across institutions and poorly integrated. To speed approvals for life-saving medicines, the client needed access to critical data from more than 200 studies and trial programs, along with a cost-effective way to standardize the data held in institutional silos. The ultimate objective was to apply advanced analytics and AI capabilities to an aggregated data set.

The Solution

Several AWS managed services were used to ingest, process, store, and analyze the client’s structured and unstructured data. The result was a secure, flexible, and cost-effective data lake for the client.

It was essential to deploy the right AWS architecture, with data ingestion capabilities able to move an extensive volume of data to the cloud. Persistent streamlined the process of ingesting multiple data types quickly and easily from the source systems into the client’s data lake, built on Amazon Simple Storage Service (Amazon S3). Persistent used AWS Database Migration Service (AWS DMS) to ensure that data integrated from various external sources was readily available for researchers and scientists to perform ad hoc analysis. Persistent then developed an advanced analytics platform using Amazon Athena and Amazon Redshift Spectrum. This approach proved to be an easy, secure, and cost-effective way to integrate the existing research data lake with the advanced analytics data mart.
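To illustrate how files in an Amazon S3 data lake can be exposed for ad hoc SQL analysis, here is a minimal Python sketch that builds an Amazon Athena DDL statement for an external table over Parquet files. It is not the client’s actual implementation: the database, table, column, and bucket names are hypothetical, as is the helper function itself.

```python
from typing import Dict


def build_external_table_ddl(
    database: str,
    table: str,
    columns: Dict[str, str],
    partition_columns: Dict[str, str],
    s3_location: str,
) -> str:
    """Return an Athena DDL statement for an external Parquet table on S3."""
    if not s3_location.startswith("s3://"):
        raise ValueError("s3_location must be an s3:// URI")
    # Athena expects the table location to be a folder-style prefix.
    if not s3_location.endswith("/"):
        s3_location += "/"

    column_sql = ",\n  ".join(
        f"`{name}` {sql_type}" for name, sql_type in columns.items()
    )
    partition_sql = ", ".join(
        f"`{name}` {sql_type}" for name, sql_type in partition_columns.items()
    )

    ddl = (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"  {column_sql}\n)\n"
    )
    if partition_sql:
        # Partition columns map to key=value folders under the location.
        ddl += f"PARTITIONED BY ({partition_sql})\n"
    ddl += f"STORED AS PARQUET\nLOCATION '{s3_location}'"
    return ddl


# Hypothetical names, used for illustration only.
EXAMPLE_DDL = build_external_table_ddl(
    database="research_lake",
    table="trial_observations",
    columns={"subject_id": "string", "visit_date": "date", "measurement": "double"},
    partition_columns={"study_id": "string"},
    s3_location="s3://example-research-lake/standardized/trial_observations",
)
print(EXAMPLE_DDL)
```

A statement like this could be run in the Athena console or submitted programmatically. Partitioning by a study identifier is one possible design choice that lets queries scan only the studies they need, which helps keep query costs down.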

By deploying a visual data preparation tool, Persistent provided clean data for analytics and machine learning. Similarly, fully managed Amazon EMR (Elastic MapReduce) jobs enabled the client to categorize and move data reliably between multiple data stores and streams.

The Outcome

With a robust data lake and data mart, the client can now locate 20% to 30% of previously missing data. With all data available on the AWS Cloud, its scientists now have access to ready-to-use data and can share all critical data from research and trial programs.

This has enabled the client to make faster, better-informed decisions when developing new medicines, with significant time and cost savings.

Technology Used
  • AWS
  • Apache Airflow
  • Apache NiFi
  • Apache Spark
  • Python
  • Git
  • Jenkins
  • Amazon Redshift
  • AWS Glue
  • Amazon S3
  • Apache Parquet

Contact us


    You can also email us directly at info@persistent.com
