Building a Cloud-Based Data and Analytics Platform
Like many small and midsize life science companies, this growing specialty-focused pharmaceutical company outsourced much of its data management. Its over-reliance on external vendors was creating data-related challenges and inefficiencies whenever it tried to address key business questions. The company had purchased several data sets, but without internal data infrastructure, teams across the organization lacked direct access to the data and had to rely on vendors to perform even simple analyses. Turnaround on requested analyses was slow, data quality often came into question, and the company could no longer afford to continue along this path. It needed easy access to its data in its own centralized location, quality measures, and analysis tools to support business needs across the organization now and into the future.
We started by breaking down the overall business objective of the client into smaller goals to allow an incremental build of an enterprise-wide, scalable, self-service platform that would serve the data and analytics needs of both our client and its partners. To do so, we followed an agile methodology—breaking up the overall project into shorter “sprints,” with each 8- to 10-week sprint designed to accomplish a specific objective.
The cloud-based platform, or “workbench,” we were to build for our client needed to support data ingestion, quality management, data aggregation, data enrichment, and self-service analytics.
Our cloud-based solution was built in a sandbox environment and included a host of tools that would be used by our client and its partners to monitor data, import and export high volumes of data, run analyses, and generate reports.
After aligning on project goals, objectives, and timelines with the client, we met with various members of the client team to garner key information for establishing business use cases and identifying pain points. We helped the client prioritize the business use cases and data sets to identify high-value quick wins to be implemented for each sprint.
We identified the key data sources that would be hosted on the cloud-based platform, as well as any data integration and enrichment needs for the client. We co-developed the design and test plans with the client team for building the workbench.
We also built and finalized the design and test plans for the quality management module, an important part of the platform we were building. To accomplish this, we defined a range of data quality checks (such as data completeness, consistency, capture, trends, business rules validation, and exceptions) for each of the prioritized data sets.
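To make two of these check types concrete, the following is a minimal sketch of a completeness check and a business rules check. It is illustrative only and does not represent the module we built: the field names (product, units, week), the rule name, and the sample rows are all hypothetical.

```python
from typing import Callable

Row = dict[str, object]


def completeness(rows: list[Row], required: list[str]) -> dict[str, float]:
    """Return, for each required field, the fraction of rows with a non-empty value."""
    total = len(rows)
    result: dict[str, float] = {}
    for field in required:
        filled = sum(1 for r in rows if r.get(field) not in (None, ""))
        result[field] = filled / total if total else 0.0
    return result


def rule_exceptions(
    rows: list[Row], rules: dict[str, Callable[[Row], bool]]
) -> dict[str, list[int]]:
    """Return, for each named business rule, the indices of rows that violate it."""
    return {
        name: [i for i, r in enumerate(rows) if not rule(r)]
        for name, rule in rules.items()
    }


# Hypothetical sample data, invented for illustration.
rows: list[Row] = [
    {"product": "A", "units": 10, "week": "2020-01-06"},
    {"product": "", "units": 5, "week": "2020-01-13"},
    {"product": "B", "units": -3, "week": "2020-01-20"},
]

# Hypothetical business rule: unit counts must be numeric and non-negative.
rules = {
    "units_non_negative": lambda r: isinstance(r.get("units"), (int, float))
    and r["units"] >= 0,
}

report_completeness = completeness(rows, ["product", "units"])
report_exceptions = rule_exceptions(rows, rules)
```

In this sketch, the second row lowers the completeness score for product and the third row is flagged as an exception to the non-negative units rule. Checks for consistency or trends could follow the same pattern of a named function applied to each prioritized data set.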
Once the plans were vetted with the client’s data strategy team, we implemented both the data workbench and quality management module: (1) we developed the workbench in close collaboration with the client’s global IT team—designing and configuring the environment to be scalable to meet current and future business needs; (2) we linked the workbench with the analytics tools the client selected; and (3) we tested the tools’ functionality and the overall readiness of the workbench environment. We also developed critical documentation for the client that supports data access, provisioning, and visualization for the target use cases.
In building the data quality management module, we started by configuring the data ingestion layer to pull data from multiple sources into the workbench. We then configured and implemented the data quality management engine to execute quality checks within the workbench. Next, we configured the user interface for monitoring data ingestion and quality, and tested the system for functionality and readiness.
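The flow described above (ingest from several sources, run quality checks, surface the results for monitoring) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the platform's actual implementation: the source names (sales_feed, claims_feed), the check name, and the in-memory loaders are hypothetical stand-ins for real vendor feeds and a real monitoring interface.

```python
from typing import Callable, Iterable

Row = dict[str, object]
Source = Callable[[], Iterable[Row]]
Check = Callable[[Row], bool]


def ingest(sources: dict[str, Source]) -> list[Row]:
    """Pull rows from every source and tag each row with the source it came from."""
    rows: list[Row] = []
    for name, load in sources.items():
        for row in load():
            rows.append({**row, "_source": name})
    return rows


def run_checks(rows: list[Row], checks: dict[str, Check]) -> dict[str, dict[str, int]]:
    """Summarize, per source, how many rows arrived and how many failed any check."""
    summary: dict[str, dict[str, int]] = {}
    for row in rows:
        stats = summary.setdefault(str(row["_source"]), {"rows": 0, "failed": 0})
        stats["rows"] += 1
        if not all(check(row) for check in checks.values()):
            stats["failed"] += 1
    return summary


# Hypothetical sources; in practice these would read from files or vendor feeds.
sources: dict[str, Source] = {
    "sales_feed": lambda: [{"units": 4}, {"units": None}],
    "claims_feed": lambda: [{"units": 7}],
}

# Hypothetical check: every row must carry a units value.
checks: dict[str, Check] = {
    "units_present": lambda r: r.get("units") is not None,
}

summary = run_checks(ingest(sources), checks)
```

The per-source summary is the kind of information a monitoring view could display, showing at a glance which feed delivered rows that failed quality checks.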
In the final stage of the engagement, we performed user acceptance testing, captured user feedback, and refined data visualization and provisioning. Once these refinements were finalized, we developed documentation and training materials on the workbench features for the target use cases and assisted the client’s data strategy team in preparing to go live.