ML Engineer · Boston, MA · Available
I build machines that learn — production ML systems, autonomous AI agents, and the infrastructure to keep them honest.
Recent MS graduate from Northeastern University (GPA 3.94, April 2026), building on a B.Tech in CS with AI & ML. I have shipped end-to-end ML systems — from CUDA kernel optimization and full MLOps platforms to autonomous multi-agent pipelines and production analytics dashboards. Currently seeking full-time roles in ML Engineering, AI Engineering, GenAI, Data Science, and Data Analytics.
Trained a 3D deep-learning ensemble (EfficientNet3D, ResNet3D, ViT3D) with SE attention and focal loss on 4,000+ 3D CTA scans. Engineered DICOM preprocessing pipelines that cut per-series processing time from 25 min to under 30 sec. AMP, 5-fold CV, test-time augmentation (TTA), full W&B experiment tracking.
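For readers unfamiliar with focal loss, here is a minimal sketch of the binary form for a single prediction. It is illustrative only and not the project's training code: the function name and the defaults `gamma=2.0` and `alpha=0.25` are assumptions, and the real model would compute this over tensors in a deep learning framework.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one example.

    p: predicted probability of the positive class, y: label in {0, 1}.
    gamma and alpha defaults are assumed values, not taken from the project.
    """
    eps = 1e-7
    p = min(max(p, eps), 1.0 - eps)          # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p           # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.9, 1)   # confident and correct
hard = binary_focal_loss(0.1, 1)   # confident and wrong
```

With `gamma=0` the expression reduces to class-weighted cross-entropy, so `gamma` alone controls how strongly easy examples are down-weighted.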
Full MLOps platform with LSTM anomaly detection and Airflow drift-triggered retraining cutting model staleness from weeks to under 24h. ONNX export + FastAPI serving at sub-100ms p99 latency. Live Evidently + Grafana observability across 5 metrics in 3 environments.
3-agent system orchestrating paper discovery, retrieval, and synthesis across Semantic Scholar, ArXiv, and PubMed via direct APIs — zero framework dependencies. Hybrid BM25 + dense retrieval with ChromaDB, custom 8-dimension ScholarEval scorer, 154 passing tests.
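One common way to combine BM25 and dense scores is to min-max normalize each list and take a weighted sum. The sketch below shows that idea under stated assumptions: the fusion rule, the `weight` parameter, and the function names are hypothetical and may differ from what the project uses, and the input scores are made up for illustration.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(bm25_scores, dense_scores, weight=0.5):
    """Rank doc ids by weight * BM25 + (1 - weight) * dense, after normalization."""
    bm25_n, dense_n = min_max(bm25_scores), min_max(dense_scores)
    docs = set(bm25_n) | set(dense_n)
    fused = {
        d: weight * bm25_n.get(d, 0.0) + (1.0 - weight) * dense_n.get(d, 0.0)
        for d in docs
    }
    # Highest fused score first; ties broken by doc id for determinism
    return sorted(fused, key=lambda d: (-fused[d], d))


# Made-up scores for three documents
bm25 = {"a": 10.0, "b": 5.0, "c": 0.0}
dense = {"a": 0.2, "b": 0.9, "c": 0.1}
ranking = hybrid_rank(bm25, dense)
```

Normalizing first matters because raw BM25 scores and cosine similarities live on different scales, so an unnormalized sum would let one retriever dominate.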
Custom CUDA kernels for matrix ops and memory-bound workloads. AMP training delivering +44% throughput with zero accuracy loss. GPU occupancy lifted from 52% to 80%+ via PyTorch Profiler and Nsight bottleneck resolution.
End-to-end analytics pipeline across millions of user events in a 3-stage funnel (View→Cart→Purchase). Cohort segmentation + A/B testing via proportions z-test across 50+ product segments. Interactive Streamlit dashboard for live stakeholder exploration.
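The proportions z-test used for A/B comparisons can be sketched with the standard library alone. This is a minimal two-sided, pooled-variance version; the function name and the example counts are illustrative assumptions, not figures from the project.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    conv_*: number of conversions, n_*: number of users in each variant.
    Assumes the pooled rate is strictly between 0 and 1.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: 100/1000 conversions in control, 150/1000 in treatment
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
```

When the same test is run across many product segments, a multiple-comparison correction is worth considering before reading individual p-values.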
Multi-class XGBoost pipeline predicting 5 EPA health risk tiers from real-time air quality data. EPA-breakpoint feature engineering, sample-weighting for class imbalance, SHAP per-prediction explainability, live Gradio dashboard with API integration.
VADER sentiment + topic modeling on customer reviews. 6 engineered features + Random Forest with Stratified K-Fold CV for booking prediction. Stakeholder-facing ROC curves, confusion matrices, and 3 actionable improvement recommendations.
End-to-end 8-class emotion classifier. Two-layer LSTM on RAVDESS + TESS achieving 85.34% test accuracy. MFCC, RMS, and ZCR feature pipeline with noise reduction. ModelCheckpoint for saving the best model and ReduceLROnPlateau for learning-rate scheduling.
Custom ReAct (Reason+Act) loop built from scratch — no framework. Autonomously plans ISS viewing windows by orchestrating async calls across geolocation, satellite tracking, and weather APIs. Dockerized Streamlit UI with full reasoning-chain transparency and step-by-step thought visualization.
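The core of a ReAct loop is small: ask a policy for a thought and an action, run the matching tool, record the observation, and repeat until the policy finishes. The sketch below is a simplified synchronous version with stubbed tools; `run_react`, `scripted_policy`, and the tool names are hypothetical stand-ins, and the actual project uses an LLM policy with async API calls.

```python
def run_react(policy, tools, goal, max_steps=5):
    """Run a thought -> action -> observation loop and return (answer, trace)."""
    trace = []
    for _ in range(max_steps):
        thought, action, arg = policy(goal, trace)
        if action == "finish":
            trace.append({"thought": thought, "action": action, "observation": arg})
            return arg, trace
        observation = tools[action](arg)  # call the selected tool
        trace.append({"thought": thought, "action": action, "observation": observation})
    return None, trace  # step budget exhausted without an answer


def scripted_policy(goal, trace):
    """Hypothetical stand-in for the LLM: chooses the next step from the trace length."""
    if len(trace) == 0:
        return "I need the observer's location.", "geolocate", goal
    if len(trace) == 1:
        return "Now check the sky conditions there.", "weather", trace[0]["observation"]
    summary = f"{trace[0]['observation']}: {trace[1]['observation']}"
    return "I have enough to answer.", "finish", summary


# Stub tools returning fixed values in place of real API calls
tools = {
    "geolocate": lambda query: "Boston",
    "weather": lambda city: "clear",
}

answer, trace = run_react(scripted_policy, tools, "Boston, MA")
```

Keeping every step in `trace` is what makes the reasoning chain displayable in a UI, and `max_steps` guards against a policy that never decides to finish.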
Tamper-evident payment ledger using ECDSA digital signatures. FastAPI REST backend for signing, verification, and querying. Full transaction lifecycle — key generation through multi-step verification — in a Dockerized stack.
Distributed PySpark ETL over multi-month NYC Yellow Taxi records. Trip duration, day-of-week, and long-trip flag engineering. Schema validation throughout; Parquet outputs optimized for downstream analytics and model training.
Llama-2-7B fine-tuned on Guanaco via QLoRA (4-bit quantization + LoRA), making training feasible on a consumer GPU. Configurable rank, dropout, and learning rate. ROUGE, BLEU, and perplexity comparison of the fine-tuned vs. base model on a held-out test set.
RAG pipeline for document Q&A + Text-to-SQL for natural language database queries — one unified backend for unstructured and structured data. NLP-driven SQL generation improved query accessibility by 40% for non-technical users.