Smart Cumulus
Overview
Smart Cumulus is an infrastructure project for securely exporting, transforming, and managing healthcare data with cloud-based tools. It uses the Bulk FHIR APIs of Epic systems to extract large volumes of medical data, which custom ETL (Extract, Transform, Load) pipelines deployed on Azure then process. The cleaned, de-identified data is stored and shared with the study team via AWS for analysis and research.
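The export step described above follows the FHIR Bulk Data Access pattern: an asynchronous `$export` kick-off request, followed by a manifest that lists the NDJSON files to download. The sketch below only illustrates the shape of that exchange. The project itself relied on the Bulk FHIR client rather than hand-written code, and the base URL, group ID, and file URLs shown here are hypothetical placeholders, not project values.

```python
from urllib.parse import urlencode


def build_kickoff_request(base_url, group_id=None, resource_types=None):
    """Return (url, headers) for a Bulk Data $export kick-off request.

    base_url and group_id are placeholders supplied by the caller;
    nothing here contacts a server.
    """
    base = base_url.rstrip("/")
    # Group-level export when a group is given, system-level otherwise.
    path = f"{base}/Group/{group_id}/$export" if group_id else f"{base}/$export"
    query = {}
    if resource_types:
        # _type restricts the export to a comma-separated list of resource types.
        query["_type"] = ",".join(resource_types)
    url = f"{path}?{urlencode(query)}" if query else path
    headers = {
        "Accept": "application/fhir+json",
        "Prefer": "respond-async",  # ask the server to run the export asynchronously
    }
    return url, headers


def list_output_files(manifest):
    """Return (resource_type, file_url) pairs from a completed export manifest."""
    return [(item["type"], item["url"]) for item in manifest.get("output", [])]


if __name__ == "__main__":
    # Hypothetical endpoint and group, for illustration only.
    url, headers = build_kickoff_request(
        "https://fhir.example.org/api/FHIR/R4",
        group_id="example-group",
        resource_types=["Patient", "Condition"],
    )
    print(url)
    print(headers)
```

In a real run the server answers the kick-off with a status URL to poll. Once the export completes, the manifest's file list is what a downstream ETL step would consume.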
ICS' Role
ICS (Information and Computing Services) is responsible for the design, deployment, and maintenance of the Smart Cumulus infrastructure. This includes setting up virtual machines, managing Dockerized workloads, and ensuring data security and compliance.
Contacts
Internal
Name | Role |
---|---|
Nicole Venteris | PM |
Ian Lackey | Tech |
Dr. Adam Wilcox | PI |
External
Name | Role | Contact Info |
---|---|---|
Dr. Ken Mandl | PI (BCH) | kenneth.mandl@childrens.harvard.edu |
James Jones | PM (BCH) | james.jones@childrens.harvard.edu |
Project Management
Major Tasks & Initiatives
Important Dates & Notes
This project ended in 2024.
Important Deadlines
Phase 1: 9/1/2022-1/14/2023
Phase 2: 1/14/2023-3/14/2023
- Initial data export setup
- First batch of healthcare data export
- Final deployment and handover: 04/2024
Deliverables
- Configured virtual machine and storage in Azure
- Secure data export pipelines built with the Bulk FHIR client and Cumulus ETL
- De-identified data uploaded to AWS S3 using cumulus-library
Standard Meetings
- All standing meetings have been cancelled. They previously included PI meetings, technical calls, and planning meetings.
Administrative Details
Tracking Time
Please ensure all work involving Smart Cumulus is logged under the "Smart Cumulus" project in Tracking Time.
Digital Landmarks
Project Web Pages
Document Repositories
- WUSTL Technical Documents on Box
- WUSTL PMO Documents on Box
- Cumulus Pilot Network Documents
- WUSTL ChatGPT Cumulus project notes
InfoSec Report
Two InfoSec requests were submitted for this project. Exports of both can be found in the project's Box folder. The OneTrust request IDs are 1045 and 1096.
Code Repositories
- ICS Cumulus Repository
- Bulk FHIR Client GitHub Repository
- Cumulus ETL GitHub Repository
- Cumulus Library Repository
- SMART on FHIR Library Repository
- OMOP on FHIR
Technical Information
Guides, Tutorials, & References
Glossary / Acronyms
Term | Definition |
---|---|
BCH | Boston Children's Hospital |