Job Description & Details
Azure Data Engineer role focused on AI‑enabled data pipelines for a healthcare client. You’ll be the go‑to person for moving massive clinical and pharmacy datasets through Azure and Databricks, and you’ll get to sprinkle LLM‑type AI features on top. It’s a senior hands‑on gig that expects you to hit the ground running with minimal onboarding.
What You'll Actually Be Doing
You’ll design and maintain ingestion pipelines that pull data from disparate health‑system sources (EHRs, pharmacy claims, 340B data) into Azure Data Lake and then transform it in Databricks for downstream analytics. Expect to work directly with VP‑level stakeholders to translate business questions into data models; you’ll also be responsible for performance‑tuning Spark jobs, handling schema evolution, and ensuring compliance with healthcare data regulations. Occasionally you’ll prototype simple LLM integrations (think summarizing clinical notes or generating dosage recommendations), and deep research‑level AI expertise isn’t required for that.
The Core Tech Stack
The non‑negotiables are Azure (Data Factory, Synapse, ADLS) and Databricks (Spark, Delta Lake). You must be comfortable writing scalable PySpark/Scala jobs, orchestrating pipelines with Azure Data Factory, and modeling large, PHI‑laden datasets. Healthcare domain knowledge (especially 340B pharmacy workflows) is a huge plus because the data shapes are quirky and compliance‑heavy. Some hands‑on experience with LLM APIs (Azure OpenAI, Hugging Face) is useful but not make‑or‑break.
Interview Expectations
- “Walk me through how you would design a scalable ingestion pipeline for 100TB of pharmacy claim data that must be GDPR/HIPAA compliant.” They’re probing whether you can pick the right Azure services and partition strategy, handle encryption at rest and in transit, and explain how you’d validate data quality.
- “Explain the trade‑offs between using Delta Lake’s Z‑order vs. traditional partitioning for query performance on a mixed‑type clinical dataset.” They want to see if you understand Delta optimizations and can justify choices based on query patterns and storage cost.
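If you want a concrete picture to talk through for the Z‑order question, the sketch below may help. It is a pure‑Python toy, not Delta Lake's actual implementation: the columns `patient_id` and `drug_id`, the 16×16 grid, and the 16‑row "files" are all assumptions made for illustration. It sorts the same rows two ways (by `patient_id` only, as a stand‑in for partitioning on that column, and by an interleaved‑bit Z‑order key) and counts how many files a min/max data‑skipping check would still have to scan for a filter on each column.

```python
from typing import List, Tuple

Row = Tuple[int, int]  # (patient_id, drug_id): hypothetical columns for illustration


def morton_key(x: int, y: int, bits: int = 4) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # bit i of x goes to even position 2i
        key |= ((y >> i) & 1) << (2 * i + 1)  # bit i of y goes to odd position 2i+1
    return key


def files_scanned(rows: List[Row], column: int, value: int,
                  rows_per_file: int = 16) -> int:
    """Count simulated files whose [min, max] range on `column` could hold `value`.

    This mimics file-level min/max data skipping: a file is skipped only when
    the filter value lies outside the range recorded for that file.
    """
    scanned = 0
    for start in range(0, len(rows), rows_per_file):
        chunk = [row[column] for row in rows[start:start + rows_per_file]]
        if min(chunk) <= value <= max(chunk):
            scanned += 1
    return scanned


# 256 rows covering every (patient_id, drug_id) pair on a 16 x 16 grid.
grid: List[Row] = [(p, d) for p in range(16) for d in range(16)]

# Layout A: sorted by patient_id first, so each file holds a single patient.
by_patient = sorted(grid)

# Layout B: sorted by the Z-order key, so each file holds a 4 x 4 block.
z_ordered = sorted(grid, key=lambda row: morton_key(row[0], row[1]))

# (files scanned for patient_id == 5, files scanned for drug_id == 5)
linear_result = (files_scanned(by_patient, 0, 5), files_scanned(by_patient, 1, 5))
zorder_result = (files_scanned(z_ordered, 0, 5), files_scanned(z_ordered, 1, 5))

print("sorted by patient_id:", linear_result)  # (1, 16)
print("z-ordered:           ", zorder_result)  # (4, 4)
```

In this toy setup the single‑column layout is unbeatable for filters on `patient_id` (1 of 16 files) but useless for filters on `drug_id` (all 16 files), while the Z‑ordered layout scans 4 files either way. That is the shape of the trade‑off to articulate: partition on a column when queries almost always filter on it, and consider Z‑ordering when queries filter on several columns in different combinations.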
Application Advice
Tailor your resume to scream “Azure Data Engineer + Healthcare” – put Azure Data Factory, Databricks, Spark, and Delta Lake right up top. Highlight any 340B or pharmacy‑related projects; even a short stint with claims data will get you noticed. Sprinkle in “LLM integration”, “AI‑enabled pipelines”, and “VP stakeholder communication” to match the posting’s keywords and soft‑skill cues. Keep the experience bullet concise, for example: “12+ years building end‑to‑end data solutions in Azure for regulated health domains”. That exact phrase will help you pass the ATS.