# Shreejeet Sahay

## About Me

I am a first-year MS CS student at the University of Virginia. My research interest lies in **Deep Learning Theory**, and I am currently working as a **Graduate Research Assistant** under [**Dr. Hadi Daneshmand**](https://hadidaneshmand.github.io/dhadi.html).

[Email](mailto:rge9ts@virginia.edu) | [LinkedIn](https://www.linkedin.com/in/shreejeet-sahay/)

## 📚 Education

**University of Virginia** Charlottesville, VA
*MS in Computer Science* (GPA: 3.862) | *August 2024 – May 2026*

**Savitribai Phule Pune University** Pune, INDIA
*Bachelor of Engineering in Information Technology (Merit Holder)* | *July 2015 – July 2019*

---

## 💼 Experience

**Graduate Research Assistant**
*University of Virginia – Charlottesville, VA* | *February 2025 – Present*

- Working under Dr. Hadi Daneshmand.

**Graduate Teaching Assistant**
*University of Virginia – Charlottesville, VA* | *September 2024 – Present*

- Graduate TA for CS 1110/1111 Introduction to Programming, an undergraduate-level course in Python.
- Developed autograders, evaluated assignments and exams, and provided feedback to 450+ students.

**Data Engineer**
*Atidiv – Pune, INDIA* | *July 2022 – July 2024*

- Developed end-to-end ETL/ELT pipelines using Python, SQL, and open-source frameworks (Singer, Meltano).
- Reduced processing time by 90% for the Calendly Event Invitees Extractor using parallel processing.
- Deployed, monitored, and optimized data pipelines in cloud environments such as Databricks, Snowflake, and BigQuery.
- Managed stakeholder requirements, wrote documentation, and conducted root cause analyses of pipeline failures.

**Independent Research Volunteer**
*Remote* | *August 2021 – July 2022*

- Conducted research on "Vehicular CO2 Emissions Forecasting" under Dr. Pranav Pawar.
- Implemented a 2-layer LSTM model to predict CO2 emissions using OBD-II data (Keras, scikit-learn, pandas).
- Published research papers in IEEE Xplore, Springer, and other journals.
  - [IEEE Paper](https://ieeexplore.ieee.org/abstract/document/10099940)
  - [Springer Paper](https://link.springer.com/chapter/10.1007/978-3-031-67762-5_16)

**Associate Software Engineer**
*Informatica – Bangalore, INDIA* | *December 2019 – July 2021*

- Developed and optimized Informatica pipelines using Java and Python, running on native and Spark engines.
- Created ETL mappings and workflows for on-premises and cloud platforms (DEQ, DaaS, IICS).
- Resolved ETL workflow failures, contributed to emergency bug fixes, and authored knowledge base articles.

---

## 🚀 Projects

**Energy-based Occupancy Monitoring** | *Fall 2024*

- Built a Random Forest Classifier to predict room occupancy using power data from the UVA Link Lab.
- Verified ground truth using K-means clustering on CO2 data, cross-validated with PIR sensor readings.
- Achieved over 94% accuracy across various rooms.
- [Project Link](https://github.com/shreejeetsahay/SAHB)

**GeoCLIP: Explainable Geographical Classification Using CLIP** | *Fall 2024*

- Fine-tuned the CLIP model for country-level classification using street-view imagery.
- Achieved Recall@1 of 77.8% and Recall@5 of 95.2%.
- Developed a geolocation dataset using the Google Maps Street View API and enhanced model explainability using attention maps.
- [Project Link](https://github.com/slh3mm/GeoCLIP)

---

## 🛠️ Technical Skills

**Languages:** Python, C, C++, SQL

**Frameworks:** Singer, Meltano, TensorFlow, Flask, PyTorch, Keras, NumPy, pandas, Apache Kafka

**Cloud & Data Management:** AWS, GCP, Snowflake, Databricks

**Tools & DevOps:** Git, Apache Airflow, Rundeck, Docker, Kubernetes

---

## 🌐 Contact Information

- **Email:** [rge9ts@virginia.edu](mailto:rge9ts@virginia.edu)
- **LinkedIn:** [linkedin.com/in/shreejeet-sahay](https://www.linkedin.com/in/shreejeet-sahay/)