MY PORTFOLIO DASHBOARD

Hi, I'm Purva, an AI Analyst specializing in LLMs, RAG systems, and production ML infrastructure. I'm currently at IpserLab building CAVO, an AI-powered travel platform that uses retrieval-augmented generation and recommendation systems. With expertise in Python, SQL, LangChain, and MLOps, I engineer end-to-end AI/ML solutions, from data pipelines to deployed models. MS in Analytics @ Northeastern (3.93 GPA) | Boston, MA. Explore my portfolio to see how I transform complex data challenges into scalable AI solutions.

Impact Metrics

15+ AI Models & Pipelines Deployed
500K+ Tokens Processed (NLP/LLM)
30-50% Faster AI-Driven Decisions
40% Improved Model Accuracy
1M+ Data Records Engineered
8 hrs Automated Weekly via Data Pipelines
Projects
9 Case studies
01
Atomic Habits: Semantic Relationship Mapping
Python · LangChain · LlamaIndex · LangGraph · Hugging Face · FAISS · Chroma · Streamlit · PyVis · NetworkX
50+ Concepts mapped · RAG Architecture
NLP · GenAI

RAG-based semantic mapping app with interactive data visualization for 50+ concepts from James Clear's "Atomic Habits." Built using LangChain, LlamaIndex, and LangGraph for advanced NLP processing, with FAISS and Chroma for vector similarity search. Features interactive PyVis concept network graphs and comprehensive semantic relationship mapping with NetworkX. Deployed on Streamlit for public access.
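
To illustrate how a concept network like this can be derived from vector similarity, here is a minimal sketch that links concepts whose embeddings exceed a cosine-similarity threshold. The three-dimensional vectors, the `build_concept_edges` helper, and the 0.8 threshold are hypothetical stand-ins chosen for illustration. The project itself lists Hugging Face, FAISS, Chroma, and NetworkX for these steps, and none of those libraries are reproduced here.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def build_concept_edges(embeddings, threshold=0.8):
    """Return (concept_a, concept_b, score) for every pair at or above threshold.

    `embeddings` maps a concept name to its vector. The threshold is an
    assumed value, not one taken from the project.
    """
    edges = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            edges.append((name_a, name_b, round(score, 3)))
    return edges


# Toy vectors: real embeddings would have hundreds of dimensions.
toy_embeddings = {
    "habit stacking": [0.9, 0.1, 0.2],
    "implementation intentions": [0.8, 0.2, 0.3],
    "identity-based habits": [0.1, 0.9, 0.3],
}

edges = build_concept_edges(toy_embeddings)
for concept_a, concept_b, score in edges:
    print(f"{concept_a} <-> {concept_b} (similarity {score})")
```

Each edge in the returned list could then be passed to a graph library for layout and rendering. A brute-force pairwise comparison is fine for about 50 concepts, while a vector index becomes useful at larger scale.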

Concept nodes: >50
Tech complexity: 9.5/10
RAG integration: Full
Load time: <2s
RAG Architecture · LangChain · LlamaIndex · LangGraph · Vector Search (FAISS) · Semantic Networks · Hugging Face Transformers
02
Respiratory Mortality Analysis System
Python · Scikit-learn · Pandas · NumPy · FastAPI · NGINX · Redis · AWS S3 · Celery · Alembic
200K+ CDC Records · Real-time Predictions
Healthcare · ML

Scalable ML and data pipeline platform for real-time predictions on 200K+ CDC respiratory mortality records. Built production-grade infrastructure with FastAPI for high-performance API endpoints, NGINX for load balancing, Redis for caching, and Celery for asynchronous task processing. Deployed on AWS, with S3 for storage and Alembic for database migrations. Features comprehensive data preprocessing with Pandas/NumPy and machine learning models built with Scikit-learn.
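
The caching layer in a setup like this typically follows a cache-aside pattern: look up the prediction by a key derived from the input, and only call the model on a miss. The sketch below shows that pattern with an in-memory dictionary standing in for Redis. `TTLCache`, `cached_predict`, `toy_model`, and the feature names are all hypothetical and not taken from the project.

```python
import time


class TTLCache:
    """In-memory stand-in for a Redis cache with per-key expiry."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)


def cached_predict(cache, model_fn, features):
    """Return (prediction, was_cache_hit) using the cache-aside pattern."""
    # Sort the items so that key order in the input dict does not matter.
    key = repr(sorted(features.items()))
    cached = cache.get(key)
    if cached is not None:
        return cached, True
    prediction = model_fn(features)
    cache.set(key, prediction)
    return prediction, False


calls = []


def toy_model(features):
    """Hypothetical model: records each call and returns a made-up score."""
    calls.append(features)
    return 0.1 * features["age_group"]


cache = TTLCache(ttl_seconds=30)
first = cached_predict(cache, toy_model, {"age_group": 3, "state": "MA"})
second = cached_predict(cache, toy_model, {"state": "MA", "age_group": 3})
print(first[1], second[1], len(calls))
```

The second request hits the cache even though its keys arrive in a different order, so the model runs only once. With a real Redis instance, the expiry would normally be delegated to the server rather than tracked in application code.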

Records processed: 200K+
API response: <100ms
System uptime: 99%
Scalability: High
FastAPI · ML Pipeline · Redis caching · Celery workers · AWS deployment · Production-grade
03
Boston 311 Opioid Crisis: Geospatial Analytics & Forecasting
Python · Excel · Pandas · NumPy · Matplotlib · Seaborn · Power BI · SARIMA · DAX · PostgreSQL · ArcGIS
3M+ Records analyzed · 15% Demand reduction
Healthcare · Geospatial

ML forecasting platform analyzing over 3 million Boston 311 service request records, with a 15% reduction in demand observed after policy implementation. Built comprehensive geospatial analytics using ArcGIS to map opioid-related incidents across neighborhoods. Developed SARIMA time-series forecasting models and interactive Power BI dashboards with advanced DAX measures. Features PostgreSQL database integration and Python-based statistical analysis, with visualization in Matplotlib and Seaborn.
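
SARIMA extends ARIMA with seasonal terms, and one of its building blocks is seasonal differencing: subtracting the value from one full season earlier to remove a repeating pattern. The sketch below shows that step alongside a seasonal-naive forecast, a simple baseline often used to benchmark seasonal models. This is not the SARIMA model from the project; the function names, the period of 4, and the request counts are made up for illustration.

```python
def seasonal_difference(series, period):
    """Subtract the value one season earlier from each observation."""
    return [series[i] - series[i - period] for i in range(period, len(series))]


def seasonal_naive_forecast(series, period, horizon):
    """Forecast each future step as the value from the same point in the last season."""
    last_season_start = len(series) - period
    return [series[last_season_start + (h % period)] for h in range(horizon)]


# Two hypothetical "years" of quarterly request counts (period = 4).
toy_counts = [10, 20, 30, 40, 12, 22, 33, 44]

differenced = seasonal_difference(toy_counts, period=4)
forecast = seasonal_naive_forecast(toy_counts, period=4, horizon=6)

print(differenced)
print(forecast)
```

After differencing, the strong quarterly swing is gone and only the year-over-year change remains, which is the kind of series the non-seasonal part of a SARIMA model is then fitted to.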

Records analyzed: 3M+
Demand reduction: 15%
Forecast accuracy: 85%
Geographic zones: 20+
Geospatial mapping (ArcGIS) · SARIMA forecasting · Power BI dashboards · PostgreSQL · Public health analytics
04
OptiChain: Supply Chain Optimization Dashboard
Power BI · DAX · Python · Power Query · SQL
35% Faster decisions · 180K+ Records analyzed
Supply Chain · BI

Built a comprehensive 5-page Power BI dashboard analyzing 180K+ supply chain records (2015-2017) to optimize inventory management, delivery performance, and profitability. Created custom DAX measures for Sales Velocity, Inventory Value at Risk, Dead Stock detection, and Net Gain per Order. Features interactive drillthrough pages, dynamic tooltips, and conditional formatting to identify fulfillment bottlenecks, shipping delays, and underperforming products, enabling data-driven decisions that reduce delays and minimize dead stock.
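
The dashboard's measures are written in DAX, but the logic behind two of them can be sketched in Python. The definitions below are assumptions, not the dashboard's actual formulas: sales velocity is taken as units sold per day, and dead stock as any product that is still on hand but has not sold within a cutoff window. The 90-day cutoff, the field names, and the sample products are hypothetical.

```python
from datetime import date


def sales_velocity(units_sold, days_in_period):
    """Average units sold per day over the period (assumed definition)."""
    if days_in_period <= 0:
        raise ValueError("days_in_period must be positive")
    return units_sold / days_in_period


def find_dead_stock(products, as_of, max_idle_days=90):
    """Return SKUs that are in stock but have not sold for more than max_idle_days."""
    dead = []
    for product in products:
        idle_days = (as_of - product["last_sale"]).days
        if product["on_hand"] > 0 and idle_days > max_idle_days:
            dead.append(product["sku"])
    return dead


# Hypothetical inventory snapshot.
toy_products = [
    {"sku": "A-100", "on_hand": 40, "last_sale": date(2017, 1, 5)},
    {"sku": "B-200", "on_hand": 15, "last_sale": date(2017, 11, 20)},
    {"sku": "C-300", "on_hand": 0, "last_sale": date(2016, 6, 1)},
]

dead = find_dead_stock(toy_products, as_of=date(2017, 12, 31))
velocity = sales_velocity(units_sold=180, days_in_period=30)

print(dead)
print(velocity)
```

Only the first product is flagged: the second sold recently, and the third has nothing left on hand, so it ties up no inventory value.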

Decision speed: +35%
Pages created: 5
Custom measures: 15+
Data processed: 180K
DAX calculations · Power Query ETL · Interactive dashboards · KPI design · Drillthrough pages · Python data cleaning
Experience & Education
Timeline: Education and Experience, 2020 - 2025
BS in Information Technology
Mumbai University
Sept 2020 - May 2023
GPA 3.70/4.0
HDFC Bank
Data Engineer
HDFC Bank Ltd · India's Largest Private Bank
Aug 2022 - Aug 2023
500K+ records | ▼ 40% lineage rework | ▲ 35% processing efficiency | ▲ 45% operational efficiency with real-time streaming
SQL • Python • AWS SageMaker • BigQuery • dbt • Apache Spark • Kafka • MLflow • TensorFlow Serving
MS in Analytics
Master of Science in Analytics
Northeastern University · Boston, MA
Sept 2023 - May 2025
GPA 3.93/4.0 | CPS Scholars and Leader Award
IpserLab
AI Analyst
IpserLab · Building CAVO: AI-Powered Travel Safety Platform
May 2025 - Present
▼ 30% failed pipeline runs | ▼ 25% iteration time | ▼ 30% low-quality outputs | ▼ 30% model triage time | ▼ 40% setup effort for new experiments
Python • SQL • LangChain • RAG • LangGraph • Feast • FAISS • Docker • Kubernetes • AWS • Databricks • Streamlit • Power BI • GPT-4V • GitHub Actions • dbt • Delta Lake
Skills & Stack
Core competencies across AI/ML, data engineering, cloud infrastructure, and analytics
6 Categories
Programming & Core Tools
96% Proficiency
Python · SQL · R · Pandas · NumPy · FastAPI
AI/ML & LLMs
92% Proficiency
Scikit-learn · XGBoost · TensorFlow Serving · TorchServe · SARIMA · Recommendation Systems · LLMs · RAG · LangChain · LlamaIndex · LangGraph · Haystack · Hugging Face Transformers · OpenAI API · GPT-4V
Data Engineering & Databases
94% Proficiency
ETL Pipelines · Data Modeling · dbt · Apache Spark · Ray · Kafka · Celery · Alembic · NGINX · PostgreSQL · Redis · Feast · FAISS · Chroma
MLOps & Infrastructure
89% Proficiency
MLflow · Weights & Biases · LangSmith · DVC · Phoenix (Arize AI) · WhyLabs · AWS (S3, SageMaker) · Databricks · Docker · Kubernetes · Terraform · GitHub Actions · CI/CD · Prometheus · Grafana
Business Intelligence & Analytics
93% Proficiency
Excel · Power BI · Tableau · Looker Studio · Streamlit · DAX
Data Visualization & Geospatial
90% Proficiency
Matplotlib · Seaborn · PyVis · NetworkX · Geospatial Mapping · ArcGIS
What People Say
Available for Opportunities

Drop me a message for

Collaboration | Hiring | Project Work

📍 Boston, MA | Open to Relocation Across the US