Tech Nonprofit Job Board

The Tech Nonprofit Job Board features open roles from organizations around the globe. Whether you're a job seeker ready to match your skills with a mission or a tech nonprofit looking to hire top talent, you're in the right place.

Hiring? If your tech nonprofit isn't listed yet, submit this form. Reach out to jobs@ffwd.org with questions about the Tech Nonprofit Job Board.

ML Engineer (Compilers)

Adalat AI

Software Engineering, Data Science
Posted on Mar 27, 2025

ML Engineer — Runtime

About Us

Adalat AI is a legal-tech startup building an end-to-end justice tech stack to transform the Indian judicial system through advanced artificial intelligence. Operating across 10 Indian states and supported by some of the world's largest foundations, we're dedicated to eliminating judicial delays and expanding access to justice through innovative technology.
Our solutions, including state-of-the-art ASR models for Indian languages, have been successfully deployed in multiple high courts, and our recent launch at the Delhi High Court marked a significant milestone. Our founding team combines expertise in law, technology, and computational linguistics, with credentials from Harvard, Oxford, Cambridge, MIT, the IITs, and IIIT Hyderabad. Adalat AI has earned recognition through prestigious competitions and partnerships, reflecting our commitment to bringing India's courts into the digital age alongside other modernized systems like UPI, Aadhaar, and online taxation.

About the Role

You’ll be a key part of the team building our Legal Intelligence runtime stack — helping us serve real-time speech recognition, retrieval, and summarization in low-bandwidth, resource-constrained environments.
This is a full-time position focused on making our ML models fast, lightweight, and deployable across thousands of Indian courtrooms — from remote district courts to the Supreme Court. As an early member of the team, you will:
Collaborate closely with the founding team to enhance model performance, enabling seamless operation for judges and stenographers.
Identify and implement innovative solutions to optimize machine learning models for various hardware architectures, including CPUs and GPUs.
Work in close collaboration with cross-functional partners in design, backend, and frontend functions.
Solve complex problems related to model efficiency and scalability.
Build cost-effective and scalable systems that can operate efficiently in resource-constrained environments.

Responsibilities

Design and optimize speech and text pipelines — especially for Indic languages.
Implement compiler-aware workflows that reduce latency, memory, and energy usage.
Apply compression techniques (quantization, pruning, distillation) to deploy models on diverse and constrained hardware.
Collaborate with hardware teams to leverage new CPU/GPU/accelerator features via MLIR, LLVM, or ONNX.
Benchmark, debug, and stress-test inference across thousands of hours of real-world audio and documents.
Build infrastructure for scalable, cost-efficient inference under heavy workloads.
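To give candidates a feel for the compression work listed above, here is a minimal pure-Python sketch of post-training affine quantization to int8. The function names and sample weights are invented for illustration and do not reflect Adalat AI's actual pipeline; production systems would use framework tooling (e.g., PyTorch or ONNX quantization) rather than hand-rolled code.

```python
# Illustrative sketch of post-training affine quantization to int8.
# All names and values are hypothetical examples, not Adalat AI's stack.

def quantize(weights, num_bits=8):
    """Map a list of floats to signed integer codes plus a scale and zero-point."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / (qmax - qmin) or 1.0  # avoid zero scale for constant inputs
    zero_point = round(qmin - w_min / scale)
    codes = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point

def dequantize(codes, scale, zero_point):
    """Recover approximate float values from the integer codes."""
    return [(c - zero_point) * scale for c in codes]

weights = [-1.2, 0.0, 0.5, 2.7]
codes, scale, zero_point = quantize(weights)
restored = dequantize(codes, scale, zero_point)
```

Storing int8 codes instead of float32 weights cuts memory roughly 4x, and the per-tensor scale/zero-point pair keeps the reconstruction error within half a quantization step, which is why this is a standard first move for constrained hardware.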

In a Year You Would Have

Optimized our end-to-end ML stack for 5,000+ courtrooms running 10–12 hours daily.
Solved some of the toughest runtime challenges in our stack — from dialect variability to model drift in noisy courtroom settings.
Delivered state-of-the-art performance in legal speech and text understanding running on real-world hardware.

Qualifications

You don’t need to meet every single qualification — we value diverse backgrounds and non-linear paths.
Educational Background:
Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related field from leading institutions.
Professional Experience:
4+ years of experience in machine learning optimization, model compression, compiler development, or related areas.
Technical Skills:
Strong programming skills in Python or C/C++.
Experience with deep learning frameworks (PyTorch or TensorFlow).
Strong understanding of compiler architectures, including front-end and middle-end optimizations, scheduling, and code generation.
Familiarity with compiler frameworks such as LLVM or MLIR.
Hands-on experience with model optimization techniques, including quantization (e.g., Post-Training Quantization, Quantization-Aware Training), pruning, and distillation.
Knowledge of hardware architectures and experience deploying ML systems in resource-constrained environments.
Additional Qualifications (Preferred):
Experience with advanced batching strategies and efficient inference engines for large language models.
Familiarity with retrieval-augmented generation (RAG), graph neural networks (GNNs), and agentic frameworks.
Experience contributing to research communities, including publications at conferences and/or journals.

Perks

WFH with flexible work hours
Unlimited PTO
Autonomy and ownership
Learning & development resources
Smart, humble, and friendly peers
Maternity and paternity leave
Contacts within the Harvard / MIT / Oxford ecosystem
Reach out to: careers@adalat.ai