Data Science Portfolio
Exploring the intersections of machine learning, data analysis, and real-world applications.
Projects
Delivery Route Optimization
Implemented A*, Nearest Neighbor, and Simulated Annealing algorithms to optimize delivery routes for 200 addresses, achieving up to 83.6% route improvement over random sampling and coming within 20.53% of the solution produced by Google's OR-Tools.
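As a rough illustration of the Nearest Neighbor heuristic named above, here is a minimal sketch rather than the project's code. It assumes straight-line (Euclidean) distances, a closed tour, and synthetic random points in place of the real addresses; the function names `route_length` and `nearest_neighbor_route` are hypothetical.

```python
import math
import random


def route_length(points, order):
    """Total length of a closed tour visiting `points` in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def nearest_neighbor_route(points, start=0):
    """Greedy tour: from each stop, move to the closest unvisited stop."""
    unvisited = set(range(len(points)))
    unvisited.remove(start)
    order = [start]
    while unvisited:
        last = points[order[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, points[j]))
        unvisited.remove(nxt)
        order.append(nxt)
    return order


# Synthetic stand-in data (assumption): 200 random points, not real addresses.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(200)]

# Baseline: visit the stops in a random order.
random_order = list(range(len(points)))
rng.shuffle(random_order)

nn_order = nearest_neighbor_route(points)
improvement = 1 - route_length(points, nn_order) / route_length(points, random_order)
print(f"Improvement over a random order: {improvement:.1%}")
```

The printed figure depends on the synthetic points and is not comparable to the 83.6% reported for the project.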
Optimizing CNN Architectures for Improved Image Classification on CIFAR-10
Raised CIFAR-10 image classification accuracy from 65.04% to 73.38% by adding a layer to the CNN and tuning hyperparameters with a grid search over 48 configurations to find the best-performing setup.
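A grid search of this size can be sketched as follows. The hyperparameter names and values are assumptions chosen only so that the grid has 48 combinations; they are not the project's actual search space. The `evaluate` argument is a hypothetical callback that would train a model for one configuration and return its validation accuracy.

```python
from itertools import product

# Hypothetical search space (assumption): 4 * 3 * 2 * 2 = 48 combinations.
search_space = {
    "learning_rate": [1e-2, 3e-3, 1e-3, 3e-4],
    "batch_size": [32, 64, 128],
    "dropout": [0.25, 0.5],
    "optimizer": ["adam", "sgd"],
}


def grid(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(space, evaluate):
    """Return (best_config, best_score), scoring each config with `evaluate`."""
    scored = ((config, evaluate(config)) for config in grid(space))
    return max(scored, key=lambda pair: pair[1])


configs = list(grid(search_space))
print(len(configs))  # 48
```

In practice `evaluate` would wrap model training and validation, which is the expensive part; the enumeration itself is cheap.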
Trends in the Diversity of Nobel Prize Winners
Conducted an extensive analysis of Nobel Prize laureates to uncover evolving trends in diversity and shifts in geographical dominance, using Python, Pandas, and statistical methods to examine disparities in gender, nationality, and prize sharing.
FIFA Player Attributes Clustering Analysis
Compared unsupervised clustering methods on FIFA player data using scikit-learn, finding that UMAP for dimensionality reduction combined with GMM for clustering performed best, with an ARI of 0.498 and a silhouette score of 0.62.
NBA Players' Physique vs. Draft Pick Analysis
Found, through data analysis and statistical testing, a statistically significant preference (p < 0.05) in NBA draft picks for taller players with lower weight and BMI.
Comparative Analysis of Machine Learning Classifiers
Analyzed the performance of three classification models (Random Forest, SVM, and Neural Networks) across three UCI datasets; Random Forest achieved the best mean test accuracy, at 93%.
Evolution of Political Priorities in Democracies and Autocracies
Analyzed the evolution of political priorities in democracies and autocracies using the UN General Debate Corpus (UNGDC, 1970-2014), applying Structural Topic Modeling (STM) in R to identify 20 key topics that reflect significant historical events and geopolitical developments.
Sentiment Analysis with BERT
Developed a sentiment analysis model using BERT to evaluate and categorize user sentiments from complex data sets, employing TensorFlow for model training and BeautifulSoup for data parsing.
Professional Experience
Taylor Guitars
Business Intelligence Intern | Jul 2024 - Sep 2024
El Cajon, San Diego, CA
- Developed and implemented an NLP model to automate problem classification in guitar repair and production notes, achieving 80%+ accuracy and saving 300+ hours of manual labor.
- Developed an automated reporting system (daily, weekly, monthly) that integrates the Chartmetric API to quantify Taylor artists' influence, using Python for data processing and SQL for database storage.
- Created production analysis dashboards in Tableau Desktop, using Tableau Prep for data preparation.