Portfolio
Capstone
- Arria Boost: Sports Analytics Data Storytelling (Report)
- Engineered a probabilistic machine learning model for NCAA Division I soccer rankings, surpassing professional soccer baseline predictions by 5%.
- Validated that the model adapts and improves across seasons through continuous learning.
- Created an interactive dashboard for real-time analytics that surfaces team performance insights to support decision-making.
- Achieved 83.3% accuracy in men’s team field predictions by modeling match draws and tournament projections, outperforming FIFA rankings.
- Skills: feature engineering, statistical machine learning modeling, data visualization, and pipeline automation.
Statistical Modeling
- StockX Sneaker Analysis
- Analyzed 2019 StockX data using multiple linear regression in R to identify factors influencing sneaker resale value and to highlight the significant variables affecting “hype value.”
- Job Training on Wages Analysis
- Used logistic regression to evaluate the impact of job training on wage increases, finding a higher likelihood of wage improvement with training.
Cloud Data Engineering
- Strava Microservice - Continuous Delivery of FastAPI Data Engineering API on AWS
- Developed a FastAPI microservice that delivers performance data visualizations via JSON, and integrated CI/CD with GitHub Actions for continuous delivery on AWS.
- Strava CLI
- Built a Python CLI for Strava activity data, using Click for user-friendly interaction.
- Marathon Finish Predictor
- Implemented a Flask-based service that predicts marathon finish times using machine learning.
- Stroke Predictor Microservice
- Created a Flask and scikit-learn application that predicts stroke likelihood, containerized with Docker for easy deployment.
- Product Recommendation Engine
Unifying Data Science/Causal Inference
- The Legacy of the Taliban on Female Education in Pakistan (Report)
- Investigated the Taliban’s impact on female education using causal inference, revealing complex effects on school enrollment rates.
Machine Learning
- Model-Based Approach to Music Genre Assignment (Report // Presentation)
- Developed machine learning models for music genre classification using Spotify data, improving genre assignment accuracy.
Mobile Application Development
- Mobile Passport
- Developed mobile applications that enhance visitor experiences in NC state parks.
- Duke Roster
- Developed a mobile application that presents information about the students enrolled in a Duke University course.
Duke Data Plus
- Plus Explorations Project
- Led a team analyzing 8 years of Data+, Code+, and CS+ program data, supporting decision-making and program efficacy with a comprehensive dashboard.
Algorithmic Trading - Financial Data and Modeling
- Powerful Pairs Trading
- Applied K-Means clustering to pairs trading across sectors, demonstrating a novel approach to algorithmic trading.