This is a senior-level AI leadership role in which you’ll own end-to-end GenAI product pipelines for high-value verticals such as telecom, healthcare, and manufacturing. It’s not an open-ended research position: you’re expected to ship production-grade models that deliver measurable results for enterprise customers.

What You'll Actually Be Doing

You’ll spend most of your day translating business problems into LLM-driven solutions, designing RAG architectures, and then shepherding those systems through the full DevOps lifecycle: prompt engineering, CI/CD, monitoring, and cost control on Azure OpenAI or Anthropic. Expect frequent triage of model drift, latency spikes, and budget overruns, plus occasional deep dives with domain SMEs to tune data pipelines.
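To make the cost-control part concrete, here is a minimal sketch of a per-request spend tracker with a budget alert. Everything in it is an assumption for illustration: the `BudgetTracker` class, the `PRICE_PER_1K` table, and the prices themselves are hypothetical and do not reflect any provider’s actual rates or API.

```python
from dataclasses import dataclass, field

# Hypothetical USD prices per 1,000 tokens; real prices vary by model and provider.
PRICE_PER_1K = {"input": 0.0025, "output": 0.01}


@dataclass
class BudgetTracker:
    """Accumulates estimated LLM spend and reports a coarse budget status."""

    monthly_budget_usd: float
    alert_fraction: float = 0.8  # warn once 80% of the budget is used
    spent_usd: float = field(default=0.0, init=False)

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add one request's estimated cost to the running total and return it."""
        cost = (
            (input_tokens / 1000) * PRICE_PER_1K["input"]
            + (output_tokens / 1000) * PRICE_PER_1K["output"]
        )
        self.spent_usd += cost
        return cost

    def status(self) -> str:
        """Return 'ok', 'warning', or 'over_budget' based on spend so far."""
        if self.spent_usd >= self.monthly_budget_usd:
            return "over_budget"
        if self.spent_usd >= self.alert_fraction * self.monthly_budget_usd:
            return "warning"
        return "ok"


tracker = BudgetTracker(monthly_budget_usd=1.0)
tracker.record(input_tokens=100_000, output_tokens=0)
print(tracker.status())
```

In a real deployment the token counts would come from the usage data returned with each model response, and the status would feed an alerting system rather than a print statement.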

The Core Tech Stack

The must-know stack is LLMs (Llama, Mistral, Gemma, Phi, Qwen) and the RAG pattern, together with hosted services such as Azure OpenAI or Anthropic. You need solid experience with prompt engineering and vector stores, plus a production mindset around Docker/Kubernetes, Terraform, and cost-optimization dashboards. If you’ve built an end-to-end GenAI service that scales to thousands of requests per day, you’re already speaking their language.
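If the RAG pattern is new to you, the retrieval half can be sketched in a few lines. This is a toy illustration under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector store, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_words(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of `size` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [
        " ".join(words[i:i + size])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble a grounded prompt from the retrieved chunks."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


docs = [
    "Roaming charges apply outside the home network.",
    "Prepaid plans renew every 30 days.",
    "Fiber installation takes two weeks.",
]
question = "When do prepaid plans renew?"
print(build_prompt(question, retrieve(question, docs, k=1)))
```

A production system would swap `embed` for a learned embedding model and the sorted list for an approximate nearest-neighbor index, but the flow of chunk, embed, retrieve, and prompt stays the same.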

Interview Expectations

Design a cost-effective RAG pipeline for a telecom churn-prediction use case. The interviewer will watch how you choose a vector DB and a chunking strategy, and how you instrument Azure cost monitoring. They want proof that you can balance latency, relevance, and a dollar budget.
Explain how you would mitigate hallucinations in a medical-record summarization model deployed on Azure OpenAI. The interviewer will look for a discussion of prompt guardrails, post-processing filters, and human-in-the-loop validation. They’re testing the depth of your safety awareness, not just model performance.
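For the hallucination question, it helps to have a concrete post-processing filter in mind. The sketch below flags summary sentences whose words are mostly absent from the source record and routes them to human review. It is a deliberately crude lexical-overlap heuristic, not a clinical-grade check; the function names and the 0.6 threshold are assumptions chosen for illustration.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter on ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def flag_unsupported(summary: str, source: str, min_support: float = 0.6) -> list[str]:
    """Return summary sentences with too little token overlap with the source."""
    source_tokens = _tokens(source)
    flagged = []
    for sentence in split_sentences(summary):
        toks = _tokens(sentence)
        if not toks:
            continue
        support = len(toks & source_tokens) / len(toks)
        if support < min_support:
            flagged.append(sentence)
    return flagged


def route(summary: str, source: str) -> dict:
    """Send the summary to human review if any sentence looks unsupported."""
    flagged = flag_unsupported(summary, source)
    return {"needs_human_review": bool(flagged), "flagged_sentences": flagged}


record = "Patient reports mild headache for three days. No fever. Prescribed ibuprofen 200 mg."
summary = "Patient reports mild headache. Patient was diagnosed with migraine and given sumatriptan."
print(route(summary, record))
```

In an interview answer you would likely pair a filter like this with stronger checks (for example, an entailment model or citation-to-source requirements) and explain who reviews flagged output and how quickly.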

Application Advice

Tailor your resume to signal “GenAI production expert.” Lead with verbs like designed, deployed, and optimized, and work in the exact tech keywords: LLM, RAG, Azure OpenAI, Anthropic, Llama, Mistral, Gemma, Phi, Qwen, NLP, cost-optimization, enterprise AI strategy. Highlight any telecom, healthcare, or manufacturing projects and quantify impact (e.g., reduced inference cost by 30%). A concise one-page summary that mirrors the bullet list in the JD is more likely to get past ATS filters.

Lead ML Engineer / Senior AI Consultant
