# Shreejeet Sahay
## About Me
I am a first-year MS CS student at the University of Virginia. My research interest lies in **Deep Learning Theory**, and I am currently working as a **Graduate Research Assistant** under [**Dr. Hadi Daneshmand**](https://hadidaneshmand.github.io/dhadi.html).
[Email](mailto:rge9ts@virginia.edu) | [LinkedIn](https://www.linkedin.com/in/shreejeet-sahay/)
## 📚 Education
**University of Virginia**
Charlottesville, VA
*MS in Computer Science* (GPA: 3.862) | *August 2024 – May 2026*
**Savitribai Phule Pune University**
Pune, India
*Bachelor of Engineering in Information Technology (Merit Holder)* | *July 2015 – July 2019*
---
## 💼 Experience
**Graduate Research Assistant**
*University of Virginia – Charlottesville, VA* | *February 2025 – Present*
- Conducting research in deep learning theory under Dr. Hadi Daneshmand.
**Graduate Teaching Assistant**
*University of Virginia – Charlottesville, VA* | *September 2024 – Present*
- Graduate TA for CS 1110/1111 Introduction to Programming, an undergraduate-level course in Python.
- Developed autograders, evaluated assignments/exams, and provided feedback to 450+ students.
**Data Engineer**
*Atidiv – Pune, India* | *July 2022 – July 2024*
- Developed end-to-end ETL/ELT pipelines using Python, SQL, and open-source frameworks (Singer, Meltano).
- Reduced the processing time of the Calendly Event Invitees extractor by 90% using parallel processing.
- Deployed, monitored, and optimized data pipelines in cloud environments like Databricks, Snowflake, and BigQuery.
- Managed stakeholder requirements, wrote documentation, and conducted root cause analyses of pipeline failures.
**Independent Research Volunteer**
*Remote* | *August 2021 – July 2022*
- Conducted research on "Vehicular CO2 Emissions Forecasting" under Dr. Pranav Pawar.
- Implemented a 2-layer LSTM model to predict CO2 emissions using OBD-II data (Keras, scikit-learn, pandas).
- Published research papers in IEEE Xplore, Springer, and other journals:
  - [IEEE Paper](https://ieeexplore.ieee.org/abstract/document/10099940)
  - [Springer Paper](https://link.springer.com/chapter/10.1007/978-3-031-67762-5_16)
**Associate Software Engineer**
*Informatica – Bangalore, India* | *December 2019 – July 2021*
- Developed and optimized Informatica pipelines using Java and Python, running on native and Spark engines.
- Created ETL mappings and workflows for on-premises and cloud platforms (DEQ, DaaS, IICS).
- Resolved ETL workflow failures, contributed to emergency bug fixes, and authored knowledge base articles.
---
## 🚀 Projects
**Energy-based Occupancy Monitoring** | *Fall 2024*
- Built a Random Forest Classifier to predict room occupancy using power data from the UVA Link Lab.
- Verified ground truth using K-means clustering on CO2 data, cross-checked against PIR sensor readings.
- Achieved over 94% accuracy across various rooms.
- [Project Link](https://github.com/shreejeetsahay/SAHB)
**GeoCLIP: Explainable Geographical Classification Using CLIP** | *Fall 2024*
- Fine-tuned the CLIP model for country-level classification using street-view imagery.
- Achieved Recall@1 of 77.8% and Recall@5 of 95.2%.
- Built a geolocation dataset using the Google Maps Street View API and used attention maps to improve model explainability.
- [Project Link](https://github.com/slh3mm/GeoCLIP)
---
## 🛠️ Technical Skills
**Languages:** Python, C, C++, SQL
**Frameworks:** Singer, Meltano, TensorFlow, Flask, PyTorch, Keras, NumPy, pandas, Apache Kafka
**Cloud & Data Management:** AWS, GCP, Snowflake, Databricks
**Tools & DevOps:** Git, Apache Airflow, Rundeck, Docker, Kubernetes
---
## 🌐 Contact Information
- **Email:** [rge9ts@virginia.edu](mailto:rge9ts@virginia.edu)
- **LinkedIn:** [linkedin.com/in/shreejeet-sahay](https://www.linkedin.com/in/shreejeet-sahay/)