ML Engineer Interview — ML Systems, MLOps, Deep Learning
The MLE loop overlaps with both software engineering and applied ML, but the centre of gravity is ML system design: feature stores, online inference, training pipelines, A/B testing infrastructure, and model lifecycle. Algorithm coding still appears, and increasingly so do deep-learning-specific questions on transformers, RAG, and LLM fine-tuning.
Coding / algorithms (45–60 min). One LC-medium problem plus a focused ML coding question — implement k-means, write a numerically-stable softmax, vectorize a slow pandas loop with numpy.
ML system design (60 min). Design recommendations for a feed, fraud detection at the edge, ad CTR prediction, or LLM-powered search. Focus on training data pipeline, feature engineering, model serving, and evaluation.
Software / infra (45 min). Distributed training basics, GPU memory math, model quantization, serving stacks (Triton/TorchServe/vLLM), feature stores.
Behavioural (45 min). Working with researchers vs product, ML rollout incidents, cross-team data ownership, model deprecation.
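The stable-softmax task from the coding round is worth being able to write cold. A minimal NumPy sketch: the naive version overflows because `exp` of any float64 input above roughly 709 is `inf`, so you subtract the row max first, which changes nothing mathematically but bounds every exponent at 0.

```python
import numpy as np

def softmax(x):
    # Subtracting the per-row max shifts the largest exponent to
    # exp(0) = 1, so np.exp never overflows; the result is
    # mathematically identical because the shift cancels in the ratio.
    x = np.asarray(x, dtype=np.float64)
    shifted = x - x.max(axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=-1, keepdims=True)
```

With the shift, `softmax([1000.0, 1000.0])` returns `[0.5, 0.5]`; the naive `exp(x) / exp(x).sum()` produces `inf / inf = nan`.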
Top Machine Learning Engineer technical questions
These are pulled from interview-debrief patterns we see most often across Data & Analytics roles. They are not memorization fodder — interviewers reword them constantly. Practice the underlying skill, not the wording.
Design a recommender for a video feed with 100M users and 50ms p99 latency budget.
Why would you choose a two-tower retrieval model over a cross-encoder for first-stage ranking?
Implement a numerically-stable softmax. Why does the naive version overflow?
Walk through how a transformer attention layer computes a single output token.
What changes when you fine-tune vs use LoRA vs prompt-tune?
Design an LLM-powered customer-support search with citation. Cover retrieval, reranking, generation, and evaluation.
Your CTR model AUC is high in offline eval but online lift is flat. What do you investigate?
How do you monitor model drift in production? What signals correlate with real degradation?
Explain why batch normalization can hurt at inference. What would you use instead?
Design a feature store. What problems is it solving that ad-hoc pipelines don't?
Training/serving skew — give two real causes and how you'd detect each.
How would you A/B test a recommendation model when results affect downstream user behaviour?
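For the attention question above, interviewers usually want the core scaled dot-product mechanics, not a full layer. A stripped-down NumPy sketch of one query vector attending over cached keys and values — multi-head structure, learned projections, and masking are deliberately omitted, and `attend_one_token` is an illustrative name:

```python
import numpy as np

def attend_one_token(q, K, V):
    """Scaled dot-product attention for a single output token.

    q: (d,)    query for the token being generated
    K: (t, d)  keys for the t context tokens
    V: (t, dv) values for the same tokens
    """
    d = q.shape[-1]
    scores = K @ q / np.sqrt(d)   # (t,) similarity of q to each key
    scores -= scores.max()        # stable softmax shift
    weights = np.exp(scores)
    weights /= weights.sum()      # attention distribution over context
    return weights @ V            # (dv,) weighted mix of value rows
```

The returned vector is a convex combination of the value rows, weighted by how strongly the query matches each key — that one sentence is usually what the interviewer is listening for.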
Behavioural questions
Describe an ML model you shipped to production. What was the offline-vs-online gap?
Tell me about a model you killed. Why?
When has working with researchers (or with product) created friction? How did you resolve it?
Walk me through the most painful debugging session you've had with a training pipeline.
Preparation tips for Machine Learning Engineer candidates
**Lead system-design rounds with the data, not the model.** Where does the training data come from, and how is it labelled, refreshed, and joined? The model is maybe 10% of an MLE design problem.
**Know the LLM stack.** Vector DBs, RAG, eval frameworks (Ragas, LangSmith), serving (vLLM, TGI), and basic GPU memory math come up constantly in 2026 loops.
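One recurring flavour of GPU memory math is estimating a full fine-tuning footprint. A back-of-envelope sketch — the byte counts assume bf16/fp16 weights and gradients plus fp32 AdamW state (two moments and a master copy), activation memory is excluded, and `train_memory_gb` is a hypothetical helper name:

```python
def train_memory_gb(params_b, bytes_per_param=2, optimizer="adamw"):
    """Rough GPU memory for full fine-tuning, in decimal GB.

    params_b: parameter count in billions.
    Assumes bf16/fp16 weights and gradients (2 bytes each) and,
    for AdamW, three fp32 tensors per parameter: first moment,
    second moment, and a master weight copy (12 bytes total).
    Activations are workload-dependent and not counted here.
    """
    n = params_b * 1e9
    weights = n * bytes_per_param
    grads = n * bytes_per_param
    optim = n * 4 * 3 if optimizer == "adamw" else 0
    return (weights + grads + optim) / 1e9
```

For a 7B model this gives 14 GB weights + 14 GB gradients + 84 GB optimizer state = 112 GB before activations — which is why "can you fine-tune a 7B model on one 80 GB GPU?" has a more interesting answer than it first appears.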
**Separate offline metrics from online business KPIs.** Strong candidates always say 'we'd verify with a live A/B against [business metric]' rather than stopping at AUC or BLEU.
**Practice writing PyTorch from a blank file.** Many MLE rounds ask for a small training loop or a custom layer in pure PyTorch — no copilot.
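For the blank-file drill, a minimal full-batch loop on synthetic linear-regression data is a reasonable baseline to have memorized. The shapes, learning rate, and epoch count below are arbitrary placeholders, not recommendations — the point is the zero-grad / forward / backward / step rhythm:

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic regression data: y = X @ w + small noise.
X = torch.randn(256, 8)
true_w = torch.randn(8, 1)
y = X @ true_w + 0.01 * torch.randn(256, 1)

model = nn.Linear(8, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

for epoch in range(50):
    opt.zero_grad()               # clear gradients from the last step
    loss = loss_fn(model(X), y)   # forward pass
    loss.backward()               # autograd fills .grad on parameters
    opt.step()                    # SGD update

final_loss = loss_fn(model(X), y).item()
```

If you can produce this unprompted, swapping in a DataLoader, a custom `nn.Module`, or a scheduler during the interview is incremental rather than terrifying.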
Practice with the AI mock interviewer
Panor's AI Job Assistant runs voice-based mock interviews tuned to the Machine Learning Engineer role. It ad-libs follow-up questions, calls out red flags in your answers, and produces a transcript with rubric-graded feedback. Resume × JD matching is also included — paste a target job description and the assistant rewrites your bullets in STAR format with keyword alignment scoring.
Strong candidates with relevant experience generally need 4–6 weeks of focused prep for a competitive Machine Learning Engineer loop. Career switchers should plan on 8–12 weeks, weighted heavily toward the data & analytics fundamentals.
Do I need to grind LeetCode?
For most Machine Learning Engineer loops in 2026, depth on a curated set of 60–80 problems beats grinding 400. Focus on the patterns the questions above test, not problem volume.
Is the format the same at startups vs Big Tech?
No. Big Tech tends to over-index on coding and system design; startups put more weight on judgement, speed, and 'will this person carry the team'. Read the JD and ask the recruiter for the explicit loop structure — they will tell you.