AWS Data Engineer

Upwork

Remote

About

Position: AWS Data Engineer
Experience: 5–7 years
Domain Preference: Healthcare data engineering

About the Role

We are seeking a highly skilled AWS Data Engineer with 5–7 years of experience to join our data engineering team. The ideal candidate will design, build, and maintain robust, secure, and scalable data pipelines on AWS. Prior experience in healthcare data projects (EHR/CCD/HL7/FHIR, claims, HCC) is preferred. You will be responsible for building and operating a Medallion Architecture (Bronze/Silver/Gold layers), optimizing large-scale data processing, ensuring compliance (HIPAA, SOC 2), and supporting downstream analytics and other initiatives.

Key Responsibilities

- Design, implement, and maintain ETL/ELT pipelines for ingesting data from multiple sources (APIs, EHRs, Snowflake, CSV, FHIR APIs, etc.).
- Build and optimize data lakes and warehouses using AWS services (Glue, Athena, Redshift, S3).
- Implement Medallion Architecture (Bronze/Silver/Gold layers) for data standardization and enrichment (a minimal PySpark sketch appears after the qualifications list).
- Develop on-demand and scheduled data ingestion workflows to support both real-time and batch needs.
- Use Airflow and PySpark to orchestrate large-scale data processing pipelines.
- Build observability and monitoring frameworks to track pipeline health, performance, and data quality.
- Implement error monitoring and notification systems (SNS, ELK, CloudWatch, Sentry) to detect and resolve issues proactively (see the Airflow sketch after the qualifications list).
- Develop dashboards that visualize pipeline health, ingestion metrics, and error trends, enabling quick identification and resolution of issues.
- Ensure data quality, lineage, and integrity through automated monitoring and validation frameworks.
- Implement security best practices (encryption, KMS, Secrets Manager) for sensitive healthcare data.
- Collaborate with the product team and stakeholders to support analytics and other initiatives.
- Automate workflows with CI/CD pipelines.
- Support compliance with HIPAA, SOC 2, and healthcare-specific regulatory requirements.
- Ensure smooth deployment of pipelines into production while maintaining separate development environments for testing, validation, and continuous improvement.

Required Skills & Qualifications

- 5–7 years of experience as a Data Engineer with a strong AWS background.
- Expertise in AWS services:
  - Glue (ETL), Athena, Redshift, S3, Step Functions, Lambda
  - Kafka (streaming ingestion)
  - CloudWatch / CloudTrail / SNS (monitoring, auditing, notifications)
- Strong hands-on experience with PySpark and Airflow.
- Hands-on experience with modern file formats such as Parquet and Avro for efficient data storage and processing.
- Strong knowledge of data modeling (star/snowflake schema; healthcare ontologies a plus).
- Hands-on experience with Medallion Architecture and best practices for scalable data lakes.
- Experience with observability frameworks (Prometheus, Grafana, AWS CloudWatch dashboards).
- Proven experience with error monitoring, notifications, and self-healing pipelines.
- Proficiency in Python for data engineering and automation.
- Experience with healthcare data standards (FHIR, HL7, CCD, claims, ICD-10, CPT, HCC).
- Good knowledge of SQL and performance optimization in Redshift/Athena.
- Experience deploying and managing data pipelines across multiple environments (dev, staging, prod), with best practices for CI/CD, version control, and environment isolation.
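For context on the Medallion responsibility above, here is a minimal sketch of a Bronze-to-Silver step on S3, assuming raw CSV landings and a claims dataset. The bucket, prefixes, and column names (claim_id, claim_date, icd10_code) are hypothetical placeholders introduced for illustration, not part of the posting.

```python
# Minimal Bronze -> Silver sketch in a Medallion layout on S3.
# Bucket names, paths, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze_to_silver_claims").getOrCreate()

# Bronze: raw claims extracts landed as CSV (schema inferred here for brevity).
bronze_df = spark.read.option("header", True).csv("s3://example-datalake/bronze/claims/")

# Silver: standardize types, normalize codes, and drop obvious duplicates.
silver_df = (
    bronze_df
    .withColumn("claim_date", F.to_date("claim_date", "yyyy-MM-dd"))
    .withColumn("icd10_code", F.upper(F.trim(F.col("icd10_code"))))
    .dropDuplicates(["claim_id"])
)

# Write columnar Parquet partitioned by date for efficient Athena/Redshift queries.
(
    silver_df.write
    .mode("overwrite")
    .partitionBy("claim_date")
    .parquet("s3://example-datalake/silver/claims/")
)
```

Partitioned Parquet keeps Athena scans cheap; a Gold layer would add conformance and enrichment on top of the Silver output.

Similarly, the error-monitoring responsibility could look like the following Airflow sketch, where a task failure publishes a message to SNS. The DAG id, topic ARN, region, and ingestion step are hypothetical, and the `schedule` argument assumes Airflow 2.4 or later.

```python
# Minimal sketch of an Airflow DAG that notifies an SNS topic on task failure.
# DAG id, topic ARN, region, and task logic are hypothetical placeholders.
from datetime import datetime

import boto3
from airflow import DAG
from airflow.operators.python import PythonOperator


def notify_failure(context):
    """Publish a short failure message to an SNS topic (assumed to exist)."""
    sns = boto3.client("sns", region_name="us-east-1")
    sns.publish(
        TopicArn="arn:aws:sns:us-east-1:123456789012:pipeline-alerts",
        Subject="Pipeline failure",
        Message=f"Task {context['task_instance'].task_id} failed in {context['dag'].dag_id}",
    )


def run_ingestion():
    # Placeholder for the real ingestion step (e.g. triggering a Glue job or Spark submit).
    print("ingesting batch...")


with DAG(
    dag_id="claims_daily_ingestion",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={"on_failure_callback": notify_failure},
) as dag:
    PythonOperator(task_id="ingest_claims", python_callable=run_ingestion)
```

Using an on_failure_callback keeps alerting separate from task logic, so the same notification path can be reused across DAGs.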