ML Engineer · Boston, MA · Available

PRATHAM CHOPRA

I build machines that learn — production ML systems, autonomous AI agents, and the infrastructure to keep them honest.

3.94 GPA · Northeastern

Top 250 Kaggle · 1,800+ Teams

13+ Projects Shipped

3 Research Papers


About

I turn hard problems into working systems.

Recent master's graduate from Northeastern University (GPA 3.94, April 2026), building on a B.Tech in CS with AI & ML. I have shipped end-to-end ML systems, from CUDA kernel optimization and full MLOps platforms to autonomous multi-agent pipelines and production analytics dashboards. Currently seeking full-time roles in ML Engineering, AI Engineering, GenAI, Data Science, and Data Analytics.

GitHub LinkedIn Email Me

Languages

Python · SQL · C++ · CUDA · HTML · CSS

Machine Learning & Deep Learning

PyTorch · TensorFlow · MONAI · XGBoost · Scikit-learn · Keras · LSTMs · CNNs · Transformers

Generative & Agentic AI

LLMs · Hugging Face · ChromaDB · Sentence-Transformers · RAG · ReAct Agents · FAISS · Prompt Engineering

MLOps & Infrastructure

Docker · MLflow · Airflow · Evidently · Grafana · ONNX · GitHub Actions · CI/CD

Data & Visualization

Pandas · PySpark · Plotly · Power BI · Streamlit · Gradio · SHAP

02 — Selected Work

What I've Built

13 Projects

001

Kaggle Competition · Medical Imaging

RSNA Intracranial Aneurysm Detection

Trained a 3D CNN ensemble (EfficientNet3D, ResNet3D, ViT3D) with SE attention and focal loss on 4,000+ 3D CTA scans. Engineered DICOM preprocessing pipelines slashing per-series time from 25 min to under 30 sec. AMP, 5-fold CV, TTA, full W&B tracking.

PyTorch · MONAI · 3D CNNs · Mixed-Precision · W&B

GitHub →

Top 250 of 1,800+ teams · 50× faster preprocessing

002

MLOps Platform

Helix — End-to-End MLOps Platform

Full MLOps platform with LSTM anomaly detection and Airflow drift-triggered retraining cutting model staleness from weeks to under 24h. ONNX export + FastAPI serving at sub-100ms p99 latency. Live Evidently + Grafana observability across 5 metrics in 3 environments.

MLflow · ONNX · FastAPI · Airflow · Evidently · Grafana · Docker

GitHub →

<100 ms p99 latency · 3 environments

003

Agentic AI · Multi-Agent System

INQUIRO — Autonomous Research Agent

3-agent system orchestrating paper discovery, retrieval, and synthesis across Semantic Scholar, ArXiv, and PubMed via direct APIs — zero framework dependencies. Hybrid BM25 + dense retrieval with ChromaDB, custom 8-dimension ScholarEval scorer, 154 passing tests.

Python · ChromaDB · Sentence-Transformers · Docker · Semantic Scholar API

GitHub →

154 passing tests · <2 min turnaround

004

GPU Systems · Low-Level Optimization

GPU Compute Lab — CUDA Kernel Optimization

Custom CUDA kernels for matrix ops and memory-bound workloads. AMP training delivering +44% throughput with zero accuracy loss. GPU occupancy lifted from 52% to 80%+ via PyTorch Profiler and Nsight bottleneck resolution.

CUDA · C++ · PyTorch · AMP · Nsight

GitHub →

+44% throughput · 80%+ GPU occupancy

005

Analytics · E-Commerce

Customer Journey & Conversion Analytics

End-to-end analytics pipeline across millions of user events in a 3-stage funnel (View→Cart→Purchase). Cohort segmentation + A/B testing via proportions z-test across 50+ product segments. Interactive Streamlit dashboard for live stakeholder exploration.

Python · Pandas · Plotly · Streamlit · Scikit-learn

GitHub →

50+ product segments · M+ events processed
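
For readers unfamiliar with the proportions z-test named in this project, here is a minimal sketch of the idea using only the standard library. It is an illustration under stated assumptions, not the project's code: the function name `two_proportion_ztest` and the sample counts are hypothetical, and the pooled-variance, two-sided form shown is one common variant of the test.

```python
import math


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for equal conversion rates (hypothetical helper).

    conv_a / conv_b are conversion counts; n_a / n_b are group sizes.
    Uses the pooled proportion to estimate the standard error.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts for one product segment: variant A vs. variant B.
z_stat, p_val = two_proportion_ztest(120, 1000, 100, 1000)
print(f"z = {z_stat:.3f}, p = {p_val:.3f}")
```

Running one such test per segment across 50+ segments would normally call for a multiple-comparison correction; whether the project applied one is not stated here.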

006

Explainable AI

Air Quality Health Risk Classifier

Multi-class XGBoost pipeline predicting 5 EPA health risk tiers from real-time air quality data. EPA-breakpoint feature engineering, sample-weighting for class imbalance, SHAP per-prediction explainability, live Gradio dashboard with API integration.

XGBoost · SHAP · Scikit-learn · Gradio · Plotly

Live Demo →

5 EPA risk tiers · Live API integration

007

NLP · Analytics

British Airways — Customer Analytics & Booking Prediction

VADER sentiment + topic modeling on customer reviews. 6 engineered features + Random Forest with Stratified K-Fold CV for booking prediction. Stakeholder-facing ROC curves, confusion matrices, and 3 actionable improvement recommendations.

NLTK · Scikit-learn · Pandas · Matplotlib

GitHub →

6 engineered features · 3 insight categories

008

Deep Learning · Audio

Speech Emotion Recognition

End-to-end 8-class emotion classifier. Two-layer LSTM on RAVDESS + TESS achieving 85.34% test accuracy. MFCC, RMS, and ZCR feature pipeline with noise reduction; ModelCheckpoint for model serialization and ReduceLROnPlateau for learning-rate scheduling.

LSTM · TensorFlow · Librosa · Scikit-learn

GitHub →

85.34% test accuracy · 8 emotion classes

009

Agentic AI

Autonomous ISS Viewing Planner Agent

Custom ReAct (Reason+Act) loop built from scratch — no framework. Autonomously plans ISS viewing windows by orchestrating async calls across geolocation, satellite tracking, and weather APIs. Dockerized Streamlit UI with full reasoning-chain transparency and step-by-step thought visualization.

Python · Streamlit · Docker · REST APIs

GitHub →

Multi-tool orchestration · Async execution
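
The ReAct pattern behind this agent can be sketched in a few lines. This is a simplified, synchronous illustration rather than the project's implementation: the three tool functions return canned strings in place of real geolocation, satellite-tracking, and weather API calls, and `scripted_policy` is a hypothetical stand-in for the LLM that would normally choose the next thought and action.

```python
from typing import Callable, Dict, List, Tuple

# One recorded step: (thought, action, action_input, observation).
Step = Tuple[str, str, str, str]


# Hypothetical stub tools; a real agent would call external APIs here.
def get_location(place: str) -> str:
    return "42.36,-71.06"


def get_iss_passes(coords: str) -> str:
    return f"pass over {coords} at 21:14 UTC"


def get_cloud_cover(coords: str) -> str:
    return "cloud cover 20%"


TOOLS: Dict[str, Callable[[str], str]] = {
    "get_location": get_location,
    "get_iss_passes": get_iss_passes,
    "get_cloud_cover": get_cloud_cover,
}


def scripted_policy(history: List[Step]) -> Tuple[str, str, str]:
    """Stand-in for the LLM: picks (thought, action, input) from the history."""
    obs = [step[3] for step in history]
    if len(history) == 0:
        return ("I need the viewer's coordinates.", "get_location", "Boston, MA")
    if len(history) == 1:
        return ("Find ISS passes for those coordinates.", "get_iss_passes", obs[0])
    if len(history) == 2:
        return ("Check whether the sky will be clear.", "get_cloud_cover", obs[0])
    return ("I have enough to answer.", "finish", f"{obs[1]}; {obs[2]}")


def react_loop(policy, tools, max_steps: int = 6):
    """Alternate reasoning and acting until the policy emits 'finish'."""
    history: List[Step] = []
    for _ in range(max_steps):
        thought, action, arg = policy(history)
        if action == "finish":
            return arg, history
        observation = tools[action](arg)  # act, then feed the result back
        history.append((thought, action, arg, observation))
    raise RuntimeError("step budget exhausted before the agent finished")


answer, trace = react_loop(scripted_policy, TOOLS)
for thought, action, arg, observation in trace:
    print(f"Thought: {thought}\nAction: {action}({arg!r})\nObservation: {observation}\n")
print("Answer:", answer)
```

The recorded `trace` is what makes a step-by-step reasoning view possible: each entry pairs the thought with the tool call and the observation it produced.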

010

Cryptography · Backend

AgentPay — ECDSA Cryptographic Ledger

Tamper-evident payment ledger using ECDSA digital signatures. FastAPI REST backend for signing, verification, and querying. Full transaction lifecycle — key generation through multi-step verification — in a Dockerized stack.

ECDSA · FastAPI · Docker · PostgreSQL

GitHub →

Tamper-evident records · Full lifecycle mgmt

011

Data Engineering

NYC Taxi Trip Record — PySpark ETL Pipeline

Distributed PySpark ETL over multi-month NYC Yellow Taxi records. Trip duration, day-of-week, and long-trip flag engineering. Schema validation throughout; Parquet outputs optimized for downstream analytics and model training.

PySpark · Python · Pandas · Parquet

GitHub →

Distributed execution · Multi-month record scale

012

GenAI · Fine-Tuning

LLM Fine-Tuning — QLoRA on Llama-2-7B

Llama-2-7B fine-tuned on Guanaco via QLoRA (4-bit quantization + LoRA), making training viable on a consumer GPU. Configurable rank, dropout, and LR. ROUGE, BLEU, and perplexity comparison of fine-tuned vs. base model on a held-out test set.

PyTorch · Hugging Face · PEFT · BitsAndBytes · QLoRA

4-bit quantization · ROUGE/BLEU evaluated

013

NLP · GenAI Application

DocBase — Multi-Modal Document Intelligence

RAG pipeline for document Q&A + Text-to-SQL for natural language database queries — one unified backend for unstructured and structured data. NLP-driven SQL generation improved query accessibility by 40% for non-technical users.

FAISS · FastAPI · Streamlit · SQLite

GitHub →

40% query efficiency · Dual data modalities

03 — The Path

Experience & Education

September 2024 – April 2026

Master of Professional Studies in Applied Machine Intelligence

Northeastern University, Boston, MA · GPA: 3.94/4.0

Coursework: Data Visualization, Enterprise Information Architecture, ML Systems
Projects spanning MLOps platforms, autonomous agents, 3D medical imaging, and distributed data engineering
Kaggle Top 250 globally out of 1,800+ teams — RSNA Intracranial Aneurysm Detection

January 2024 – May 2024

Data Science Intern

Futurense Technologies, Bangalore

Designed ETL pipelines ingesting data from 3+ heterogeneous source systems, resolving inconsistencies across 10,000+ records
Built KPI dashboards surfacing healthcare policy insights across 5 indicators for non-technical stakeholders
Automated preprocessing workflows reducing analyst data-wrangling time by ~30%

December 2021 – April 2023

Co-Founder & Analyst

ENFUME · D2C Luxury Perfume Brand

Launched brand from zero — built full storefront in WordPress, HTML, and CSS; drove initial customer acquisition
Tracked conversion funnels, identified key drop-off points, and implemented fixes improving site-to-checkout flow
Applied customer behavior analytics to guide product positioning, pricing, and growth strategy

2020 – 2024

B.Tech in Computer Science (AI & ML)

Jain (Deemed-to-be) University, Bengaluru · CGPA: 8.76/10

Specialized in Artificial Intelligence and Machine Learning
Published IEEE conference paper on AR in the Fashion Industry (2022)
Competed globally on Kaggle, achieving Top 250 ranking

04 — Research

Research & Writing

IEEE Xplore · 2022

AR in Fashion Industries

Dwaj Ranka, Pratham Chopra, Ranvir M Mehta

Virtual trial room using OpenCV and Augmented Reality for real-time cloth simulation. Background/subject separation via color palette analysis and thresholding. Presented at the 4th IEEE International Conference on Advances in Computing, Communication Control and Networking.

Read on IEEE Xplore →

Research Paper

Robotics and AI in Industry 4.0

Dwaj Ranka, Neell Ravindra Ambere, Pratham Chopra, Ranvir M Mehta

Examined RPA and AI integration within Industry 4.0. Explored Neural Networks, Text Mining, and NLP for data extraction, classification, and process optimization.

Read Paper →

Research Paper

LAI: Voice Assistant with Emotional Response

Dwaj Ranka, Neell Ravindra Ambere, Pratham Chopra, Ranvir M Mehta

Paradigm giving voice assistants emotional intelligence using ML and audio preprocessing. Captures user emotions and generates contextually relevant responses with sentiment analysis.

Read Paper →

05 — Credentials

Certifications

Microsoft

Azure AI Fundamentals (AI-900)

January 2024 · Verify →

IIIT-B

Post Graduate Program in Data Science & AI

2024

DeepLearning.AI (Coursera)

Generative AI with LLMs

August 2023

Cognitive Class (IBM)

Machine Learning with Python

September 2023

Futurense Technologies

Prompt Engineering

February 2024

Udemy

Data Analysis with Pandas and Python

January 2023

Futurense Technologies

Data Visualization and Storytelling

January 2024

UC Irvine (Coursera)

Data Warehousing and Business Intelligence

December 2022

Daydream (Coursera)

Introduction to AR and ARCore

July 2023

LET'S BUILD SOMETHING.
