ML Engineer  ·  Boston, MA  ·  Available

PRATHAM CHOPRA

I build machines that learn — production ML systems, autonomous AI agents, and the infrastructure to keep them honest.

3.94 GPA · Northeastern
Top 250 Kaggle · 1,800+ Teams
13+ Projects Shipped
3 Research Papers
01 — About

I turn hard problems into working systems.

Recent MS graduate from Northeastern University (GPA 3.94, April 2026), building on a B.Tech in CS with AI & ML. I have shipped end-to-end ML systems — from CUDA kernel optimization and full MLOps platforms to autonomous multi-agent pipelines and production analytics dashboards. Currently seeking full-time roles in ML Engineering, AI Engineering, GenAI, Data Science, and Data Analytics.

Languages
Python · SQL · C++ · CUDA · HTML · CSS
Machine Learning & Deep Learning
PyTorch · TensorFlow · MONAI · XGBoost · Scikit-learn · Keras · LSTMs · CNNs · Transformers
Generative & Agentic AI
LLMs · Hugging Face · ChromaDB · Sentence-Transformers · RAG · ReAct Agents · FAISS · Prompt Engineering
MLOps & Infrastructure
Docker · MLflow · Airflow · Evidently · Grafana · ONNX · GitHub Actions · CI/CD
Data & Visualization
Pandas · PySpark · Plotly · Power BI · Streamlit · Gradio · SHAP
02 — Selected Work

What I've Built

13 Projects
001
Kaggle Competition · Medical Imaging

RSNA Intracranial Aneurysm Detection

Trained a 3D CNN ensemble (EfficientNet3D, ResNet3D, ViT3D) with SE attention and focal loss on 4,000+ 3D CTA scans. Engineered DICOM preprocessing pipelines slashing per-series time from 25 min to under 30 sec. AMP, 5-fold CV, TTA, full W&B tracking.

PyTorch · MONAI · 3D CNNs · Mixed-Precision · W&B
Top 250 of 1,800+ teams · 50× faster preprocessing
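
For readers unfamiliar with the focal loss mentioned in the description above, here is a minimal sketch in plain Python for a single binary prediction. The function name is illustrative, and the `gamma` and `alpha` defaults are the commonly cited values, which are an assumption here and not necessarily the settings used in the competition model.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one example.

    p: predicted probability of the positive class
    y: true label, 0 or 1
    gamma, alpha: assumed defaults, not taken from the project
    """
    p = min(max(p, eps), 1.0 - eps)            # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.95, 1)  # confident and correct
hard = binary_focal_loss(0.10, 1)  # confident and wrong
```

The hard example contributes far more loss than the easy one, which is why this loss is a common choice when positives are rare.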
002
MLOps Platform

Helix — End-to-End MLOps Platform

Full MLOps platform with LSTM anomaly detection and Airflow drift-triggered retraining cutting model staleness from weeks to under 24h. ONNX export + FastAPI serving at sub-100ms p99 latency. Live Evidently + Grafana observability across 5 metrics in 3 environments.

MLflow · ONNX · FastAPI · Airflow · Evidently · Grafana · Docker
<100 ms p99 latency · 3 environments
003
Agentic AI · Multi-Agent System

INQUIRO — Autonomous Research Agent

3-agent system orchestrating paper discovery, retrieval, and synthesis across Semantic Scholar, ArXiv, and PubMed via direct APIs — zero framework dependencies. Hybrid BM25 + dense retrieval with ChromaDB, custom 8-dimension ScholarEval scorer, 154 passing tests.

Python · ChromaDB · Sentence-Transformers · Docker · Semantic Scholar API
154 passing tests · <2 min turnaround
004
GPU Systems · Low-Level Optimization

GPU Compute Lab — CUDA Kernel Optimization

Custom CUDA kernels for matrix ops and memory-bound workloads. AMP training delivering +44% throughput with zero accuracy loss. GPU occupancy lifted from 52% to 80%+ via PyTorch Profiler and Nsight bottleneck resolution.

CUDA · C++ · PyTorch · AMP · Nsight
+44% throughput · 80%+ GPU occupancy
005
Analytics · E-Commerce

Customer Journey & Conversion Analytics

End-to-end analytics pipeline across millions of user events in a 3-stage funnel (View→Cart→Purchase). Cohort segmentation + A/B testing via proportions z-test across 50+ product segments. Interactive Streamlit dashboard for live stakeholder exploration.

Python · Pandas · Plotly · Streamlit · Scikit-learn
50+ product segments · M+ events processed
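
The proportions z-test mentioned in the description above can be sketched with the standard library alone. This is an illustration of the general two-sample test with a pooled standard error, not the project's code; the function name and the example counts are hypothetical.

```python
import math


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)   # rate under the null hypothesis
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: 120 of 1,000 users convert in variant A, 100 of 1,000 in B
z, p_value = two_proportion_ztest(120, 1000, 100, 1000)
```

When the test is repeated across many segments, a multiple-comparison correction is usually worth considering before reading any single p-value as significant.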
006
Explainable AI

Air Quality Health Risk Classifier

Multi-class XGBoost pipeline predicting 5 EPA health risk tiers from real-time air quality data. EPA-breakpoint feature engineering, sample-weighting for class imbalance, SHAP per-prediction explainability, live Gradio dashboard with API integration.

XGBoost · SHAP · Scikit-learn · Gradio · Plotly
5 EPA risk tiers · Live API integration
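
One common way to do the sample-weighting for class imbalance mentioned above is inverse class frequency. The sketch below assumes that scheme (the description does not say which one the project uses), and the function name and labels are illustrative.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Weight each sample by n_samples / (n_classes * count of its class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return [n_samples / (n_classes * counts[y]) for y in labels]


# Hypothetical labels: tier 0 is three times as common as tier 1
weights = balanced_sample_weights([0, 0, 0, 1])
```

Weights of this form sum to the number of samples, so rare tiers count for more without changing the overall scale of the loss. A list like this is the kind of per-sample array typically passed as `sample_weight` when fitting a gradient-boosted classifier.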
007
NLP · Analytics

British Airways — Customer Analytics & Booking Prediction

VADER sentiment + topic modeling on customer reviews. 6 engineered features + Random Forest with Stratified K-Fold CV for booking prediction. Stakeholder-facing ROC curves, confusion matrices, and 3 actionable improvement recommendations.

NLTK · Scikit-learn · Pandas · Matplotlib
6 engineered features · 3 insight categories
008
Deep Learning · Audio

Speech Emotion Recognition

End-to-end 8-class emotion classifier. Two-layer LSTM on RAVDESS + TESS achieving 85.34% test accuracy. MFCC, RMS, and ZCR feature pipeline with noise reduction, ModelCheckpoint for saving the best model, and ReduceLROnPlateau for learning-rate scheduling.

LSTM · TensorFlow · Librosa · Scikit-learn
85.34% test accuracy · 8 emotion classes
009
Agentic AI

Autonomous ISS Viewing Planner Agent

Custom ReAct (Reason+Act) loop built from scratch — no framework. Autonomously plans ISS viewing windows by orchestrating async calls across geolocation, satellite tracking, and weather APIs. Dockerized Streamlit UI with full reasoning-chain transparency and step-by-step thought visualization.

Python · Streamlit · Docker · REST APIs
Multi-tool orchestration · Async execution
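
The ReAct loop described above alternates a reasoning step with a tool call until the agent decides it can answer. The sketch below shows that control flow only. It is synchronous for brevity, a scripted policy stands in for the language model, and every name and return value in it (`run_react`, `locate`, `next_pass`, the coordinates and pass time) is a hypothetical placeholder, not the project's code or real API output.

```python
def run_react(policy, tools, question, max_steps=5):
    """Run thought -> action -> observation steps until the policy finishes."""
    trace = []  # (thought, action, result) tuples, kept so the reasoning can be shown
    for _ in range(max_steps):
        thought, action, arg = policy(question, trace)
        if action == "finish":
            trace.append((thought, action, arg))
            return arg, trace
        observation = tools[action](arg)       # act, then record what came back
        trace.append((thought, action, observation))
    return None, trace                         # step budget exhausted


# Stub tools returning fixed, made-up values
def locate(city):
    return {"lat": 42.36, "lon": -71.06}


def next_pass(coords):
    return "21:14 local"


def scripted_policy(question, trace):
    """Stand-in for the model: picks the next step from the trace so far."""
    if not trace:
        return ("I need coordinates first.", "locate", "Boston")
    if len(trace) == 1:
        return ("Now look up the next pass.", "next_pass", trace[0][2])
    return ("I have what I need.", "finish", f"Next pass at {trace[1][2]}")


answer, trace = run_react(
    scripted_policy,
    {"locate": locate, "next_pass": next_pass},
    "When can I see the ISS?",
)
```

Keeping the full trace is what makes a step-by-step view of the agent's reasoning possible in a UI.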
010
Cryptography · Backend

AgentPay — ECDSA Cryptographic Ledger

Tamper-evident payment ledger using ECDSA digital signatures. FastAPI REST backend for signing, verification, and querying. Full transaction lifecycle — key generation through multi-step verification — in a Dockerized stack.

ECDSA · FastAPI · Docker · PostgreSQL
Tamper-evident records · Full lifecycle mgmt
011
Data Engineering

NYC Taxi Trip Record — PySpark ETL Pipeline

Distributed PySpark ETL over multi-month NYC Yellow Taxi records. Trip duration, day-of-week, and long-trip flag engineering. Schema validation throughout; Parquet outputs optimized for downstream analytics and model training.

PySpark · Python · Pandas · Parquet
Distributed execution · Multi-month record scale
012
GenAI · Fine-Tuning

LLM Fine-Tuning — QLoRA on Llama-2-7B

Llama-2-7B fine-tuned on Guanaco via QLoRA (4-bit + LoRA), consumer GPU viable. Configurable rank, dropout, and LR. ROUGE, BLEU, and perplexity comparison of fine-tuned vs. base model on held-out test set.

PyTorch · Hugging Face · PEFT · BitsAndBytes · QLoRA
4-bit quantization · ROUGE/BLEU evaluated
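
Of the metrics listed above, perplexity is the simplest to state: it is the exponential of the average negative log-likelihood per token. The sketch below assumes natural-log token probabilities have already been collected from a model; the function name and the input values are illustrative, not results from this project.

```python
import math


def perplexity(token_log_probs):
    """exp(-mean log p) over the tokens of a held-out text (natural log)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Made-up values: a model that gives every token probability 0.25
uniform_ppl = perplexity([math.log(0.25)] * 4)
```

Lower is better, so a fine-tuned model is expected to score below the base model on the same held-out set if the fine-tuning helped.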
013
NLP · GenAI Application

DocBase — Multi-Modal Document Intelligence

RAG pipeline for document Q&A + Text-to-SQL for natural language database queries — one unified backend for unstructured and structured data. NLP-driven SQL generation improved query accessibility by 40% for non-technical users.

FAISS · FastAPI · Streamlit · SQLite
40% query efficiency · Dual data modalities
03 — The Path

Experience & Education

September 2024 – April 2026
Master of Professional Studies in Applied Machine Intelligence
Northeastern University, Boston, MA · GPA: 3.94/4.0
  • Coursework: Data Visualization, Enterprise Information Architecture, ML Systems
  • Projects spanning MLOps platforms, autonomous agents, 3D medical imaging, and distributed data engineering
  • Kaggle Top 250 globally out of 1,800+ teams — RSNA Intracranial Aneurysm Detection
January 2024 – May 2024
Data Science Intern
Futurense Technologies, Bangalore
  • Designed ETL pipelines ingesting data from 3+ heterogeneous source systems, resolving inconsistencies across 10,000+ records
  • Built KPI dashboards surfacing healthcare policy insights across 5 indicators for non-technical stakeholders
  • Automated preprocessing workflows reducing analyst data-wrangling time by ~30%
December 2021 – April 2023
Co-Founder & Analyst
ENFUME · D2C Luxury Perfume Brand
  • Launched brand from zero — built full storefront in WordPress, HTML, and CSS; drove initial customer acquisition
  • Tracked conversion funnels, identified key drop-off points, and implemented fixes improving site-to-checkout flow
  • Applied customer behavior analytics to guide product positioning, pricing, and growth strategy
2020 – 2024
B.Tech in Computer Science (AI & ML)
Jain (Deemed-to-be) University, Bengaluru · CGPA: 8.76/10
  • Specialized in Artificial Intelligence and Machine Learning
  • Published IEEE conference paper on AR in the Fashion Industry (2022)
  • Competed globally on Kaggle, achieving Top 250 ranking
04 — Research

Research & Writing

IEEE Xplore · 2022
AR in Fashion Industries
Dwaj Ranka, Pratham Chopra, Ranvir M Mehta
Virtual trial room using OpenCV and Augmented Reality for real-time cloth simulation. Background/subject separation via color palette analysis and thresholding. Presented at the 4th IEEE International Conference on Advances in Computing, Communication Control and Networking.
Research Paper
Robotics and AI in Industry 4.0
Dwaj Ranka, Neell Ravindra Ambere, Pratham Chopra, Ranvir M Mehta
Examined RPA and AI integration within Industry 4.0. Explored Neural Networks, Text Mining, and NLP for data extraction, classification, and process optimization.
Research Paper
LAI: Voice Assistant with Emotional Response
Dwaj Ranka, Neell Ravindra Ambere, Pratham Chopra, Ranvir M Mehta
Paradigm giving voice assistants emotional intelligence using ML and audio preprocessing. Captures user emotions and generates contextually relevant responses with sentiment analysis.
05 — Credentials

Certifications

Microsoft
Azure AI Fundamentals (AI-900)
January 2024
IIIT-B
Post Graduate Program in Data Science & AI
2024
DeepLearning.AI (Coursera)
Generative AI with LLMs
August 2023
Cognitive Class (IBM)
Machine Learning with Python
September 2023
Futurense Technologies
Prompt Engineering
February 2024
Udemy
Data Analysis with Pandas and Python
January 2023
Futurense Technologies
Data Visualization and Storytelling
January 2024
UC Irvine (Coursera)
Data Warehousing and Business Intelligence
December 2022
Daydream (Coursera)
Introduction to AR and ARCore
July 2023
06 — Let's Talk

LET'S BUILD SOMETHING.