About Me
I'm a passionate Machine Learning Engineer currently pursuing an M.Tech in Data Science and Engineering at BITS Pilani. I focus on building practical ML solutions that solve real-world problems and shipping them as deployed applications.
Education
M.Tech in Data Science and Engineering
BITS Pilani
Work Integrated Learning Programme • 2023 - 2025
Focusing on machine learning algorithms and statistical modeling. The program provides strong foundations in data science with practical industry applications.
Featured Projects
Hybrid Spotify Recommendation System
A production-grade hybrid recommender system combining content-based and collaborative filtering. Designed for scalability, performance, and automated CI/CD deployment on AWS.
- Content-based filtering using cosine similarity on song metadata
- Collaborative filtering on a sparse matrix of 9.7M users × 34K songs (330M+ interactions)
- Hybrid score: Weighted average of both approaches for better personalization
- ML Workflow: Versioned pipelines with DVC & Git, artifact storage in AWS S3
- CI/CD: Docker → GitHub Actions → AWS ECR → CodeDeploy (Blue/Green deployment)
- Example: Recommends similar tracks to “Love Story” by Taylor Swift
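The hybrid score described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes item-item cosine similarity for the collaborative part, and the function names, the `content_weight` parameter, and the toy vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def hybrid_scores(query, content_vectors, interaction_vectors, content_weight=0.5):
    """Rank songs by a weighted average of content and collaborative similarity.

    content_vectors: song -> metadata feature vector (hypothetical features)
    interaction_vectors: song -> per-user interaction counts (one entry per user)
    content_weight: share given to the content-based score (assumed parameter)
    """
    scores = {}
    for song in content_vectors:
        if song == query:
            continue
        content_sim = cosine_similarity(content_vectors[query], content_vectors[song])
        collab_sim = cosine_similarity(interaction_vectors[query], interaction_vectors[song])
        scores[song] = content_weight * content_sim + (1 - content_weight) * collab_sim
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data, made up for illustration only.
content_vectors = {
    "song_a": [1.0, 0.0, 1.0],
    "song_b": [1.0, 0.0, 0.9],
    "song_c": [0.0, 1.0, 0.0],
}
interaction_vectors = {
    "song_a": [1, 1, 0, 1],
    "song_b": [1, 0, 0, 1],
    "song_c": [0, 0, 1, 0],
}

ranked = hybrid_scores("song_a", content_vectors, interaction_vectors, content_weight=0.4)
```

Here `ranked` lists the other songs from most to least similar to `song_a`. At the scale quoted above, a sparse matrix representation would replace the plain lists, but the weighting step stays the same.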
Bangalore Apartment Price Analyzer
A production-grade regression system for predicting Bangalore apartment prices based on property features. Designed for data-backed real estate decisions and built with a modern deployment stack.
- Market Snapshot Dashboard: Live stats like average price, price/sqft, and property listings.
- Price Predictor: ML-powered tool to estimate apartment prices based on zone, area, BHK, and more.
- Analysis Dashboard: Visualize price trends, filter by zone/BHK/type, and view property distributions.
- Apartment Recommender: Suggests similar apartments using location, price, and features (cosine similarity).
- End-to-end MLOps: Model training, DVC pipeline, Git versioning, and FastAPI + Streamlit deployment.
- Web Scraping: Custom scraper using HTTPX and Selectolax for dynamic apartment data collection.
Delivery Time Prediction App
A full-stack ML application that predicts food delivery times using real-world features and historical data. Built with versioned data pipelines, MLflow experiment tracking, and automated CI/CD deployment to AWS.
- ML Pipeline: Data cleaning, feature engineering, training, and model selection via Optuna
- ML Workflow: DVC for data versioning, tracked via Git and stored on AWS S3
- Experiment Tracking: MLflow + Dagshub for experiment logging and model registry
- CI/CD: GitHub Actions → Docker → AWS ECR → EC2/CodeDeploy with rolling updates
- API Endpoint: Deployed via FastAPI with a live Swagger UI
- Example: Predicts the ETA for an order from a restaurant in Indore to an urban delivery location in high traffic
YouTube Comment Sentiment Analyzer
A full-stack NLP application that analyzes YouTube comments in real time using a custom Chrome extension, backed by a machine learning model served with Flask. Ideal for creators to assess public sentiment quickly and visually.
- Chrome Extension: Extracts YouTube comments on-page and sends data to the backend API
- Sentiment Model: Logistic Regression trained on preprocessed YouTube data with TF-IDF features
- Pipeline: Tokenization, stopword removal, n-grams, undersampling, and hyperparameter tuning
- Backend: Flask-based REST API for serving predictions in real time
- Model Accuracy: Achieved 87.98% test accuracy on YouTube sentiment classification
- MLOps: MLflow for experiment tracking, DVC for pipeline versioning
- Deployment-ready: Extension and backend work together live in-browser
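The undersampling step in the pipeline above could look something like the sketch below. It assumes plain random undersampling down to the size of the smallest class; the project may use a different strategy or a library implementation, and the `undersample` helper and sample comments are hypothetical.

```python
import random
from collections import Counter, defaultdict


def undersample(samples, labels, seed=42):
    """Randomly drop samples so every class matches the smallest class size.

    Returns a shuffled list of (sample, label) pairs. The fixed seed is an
    assumption made here so that the result is reproducible.
    """
    rng = random.Random(seed)

    # Group samples by their class label.
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    # Every class is cut down to the size of the rarest class.
    target_size = min(len(items) for items in by_label.values())

    balanced = []
    for label, items in by_label.items():
        for sample in rng.sample(items, target_size):
            balanced.append((sample, label))

    rng.shuffle(balanced)
    return balanced


# Made-up comments for illustration: positive comments outnumber the rest.
comments = ["great video", "loved it", "so helpful", "nice work", "too long", "meh"]
sentiments = ["positive", "positive", "positive", "positive", "negative", "neutral"]

balanced = undersample(comments, sentiments)
class_counts = Counter(label for _, label in balanced)
```

After balancing, `class_counts` holds one entry per sentiment class with equal counts, so the classifier is not dominated by the majority class during training.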
Technical Skills
Programming
Python, SQL
Machine Learning
Scikit-learn, XGBoost, Gradient Boosting, NLTK, Optuna
Data & Databases
Pandas, NumPy, MySQL, ETL Pipelines
Visualization
Matplotlib, Seaborn, Plotly, Power BI
Web Frameworks
Flask, Streamlit
MLOps
Git, DVC, MLflow, Docker, CI/CD Pipelines, AWS (EC2, CodeDeploy, ECR, S3)