Overview
This page summarises the AWS solution architecture. Because the codebase is written in Terraform HCL, support for other cloud providers could be added if there are perceived benefits.
The AWS Data Lake architecture is composed of three distinct Terraform modules.
The serverless API was adapted from the FHIR Works on AWS project. The major differences are the removal of Elasticsearch and the conversion of the CloudFormation templates into HCL for use with Terraform.
Amazon API Gateway exposes the RESTful endpoint in front of the Lambda function and uses Amazon Cognito to authorise clients.
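The wiring between API Gateway, Cognito and Lambda can be sketched in Terraform roughly as follows. This is a minimal illustration, not the project's actual API module: it assumes a REST API with a Cognito user pool authorizer and a Lambda proxy integration, and every resource name and the `lambda_invoke_arn` variable are hypothetical.

```hcl
# Hypothetical input: invoke ARN of the Lambda function serving the FHIR API.
variable "lambda_invoke_arn" {
  type        = string
  description = "Invoke ARN of the Lambda function behind the API"
}

# User pool that holds the API clients (name is an assumption).
resource "aws_cognito_user_pool" "clients" {
  name = "fhir-api-clients"
}

resource "aws_api_gateway_rest_api" "fhir" {
  name = "fhir-api"
}

# Authorizer that validates tokens issued by the Cognito user pool.
resource "aws_api_gateway_authorizer" "cognito" {
  name          = "cognito"
  rest_api_id   = aws_api_gateway_rest_api.fhir.id
  type          = "COGNITO_USER_POOLS"
  provider_arns = [aws_cognito_user_pool.clients.arn]
}

# Catch-all path so that every route is forwarded to the Lambda function.
resource "aws_api_gateway_resource" "proxy" {
  rest_api_id = aws_api_gateway_rest_api.fhir.id
  parent_id   = aws_api_gateway_rest_api.fhir.root_resource_id
  path_part   = "{proxy+}"
}

# Any HTTP method on the catch-all path requires a valid Cognito token.
resource "aws_api_gateway_method" "proxy" {
  rest_api_id   = aws_api_gateway_rest_api.fhir.id
  resource_id   = aws_api_gateway_resource.proxy.id
  http_method   = "ANY"
  authorization = "COGNITO_USER_POOLS"
  authorizer_id = aws_api_gateway_authorizer.cognito.id
}

# Lambda proxy integration; API Gateway always invokes Lambda with POST.
resource "aws_api_gateway_integration" "lambda" {
  rest_api_id             = aws_api_gateway_rest_api.fhir.id
  resource_id             = aws_api_gateway_method.proxy.resource_id
  http_method             = aws_api_gateway_method.proxy.http_method
  integration_http_method = "POST"
  type                    = "AWS_PROXY"
  uri                     = var.lambda_invoke_arn
}
```

The sketch leaves out the deployment, the stage and the `aws_lambda_permission` that allows API Gateway to invoke the function, all of which a working setup would also need.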
The lake module defines a data pipeline that transforms the electronic health records (EHRs) into a Common Data Model (CDM) that can be queried later.
The data lake ingests the FHIR records from DynamoDB (defined in the API module) by running a Spark ETL job on AWS Glue, which transforms the data schema from FHIR (patient-level data) into OMOP (population-level data). The job also removes patient identifiers from the records so that data analysts can work with the de-identified data.
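A Glue Spark job of this kind might be declared in Terraform along the following lines. This is a sketch under stated assumptions rather than the lake module itself: the job name, the three input variables, the worker sizing and the custom `--source_table` argument are all hypothetical, and the ETL script that performs the FHIR to OMOP mapping is assumed to already exist in S3.

```hcl
# Hypothetical inputs; none of these names come from the project.
variable "glue_role_arn" {
  type        = string
  description = "IAM role the Glue job assumes"
}

variable "etl_script_location" {
  type        = string
  description = "S3 URI of the Spark ETL script"
}

variable "fhir_table_name" {
  type        = string
  description = "DynamoDB table holding the FHIR records"
}

resource "aws_glue_job" "fhir_to_omop" {
  name              = "fhir-to-omop"
  role_arn          = var.glue_role_arn
  glue_version      = "4.0"
  worker_type       = "G.1X"
  number_of_workers = 2

  command {
    name            = "glueetl" # Spark ETL job type
    script_location = var.etl_script_location
    python_version  = "3"
  }

  default_arguments = {
    "--job-language" = "python"
    # Custom argument (an assumption) that the script would read to find its source table.
    "--source_table" = var.fhir_table_name
  }
}
```

Passing the table name as a job argument keeps the script free of hard-coded resource names, so the same script can run against whichever table the API module creates.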