haison19952013/Personal-Data-Science-Projects

Son Le's Data Science Portfolio

This repository showcases hands-on data science and analytics projects across machine learning, experimentation, forecasting, business intelligence, and applied Python.

Why this portfolio stands out

  • End-to-end project work: data extraction, cleaning, modeling, evaluation, and communication
  • Business-focused problem solving across product, finance, and operations use cases
  • Experience with both notebooks and deployable Python applications
  • Evidence of practical delivery: dashboards, APIs, forecasting workflows, and Kaggle results

Featured projects

| Project | Focus | Tools | Outcome |
| --- | --- | --- | --- |
| Titanic Survival Prediction | End-to-end classification and model interpretation | Python, scikit-learn, SHAP | Reached about 78% test accuracy and ranked in the top 23% on Kaggle |
| Stock Volatility Forecasting In Vietnam | ETL, time-series forecasting, and API development | Python, Pandas, SQLite, GARCH, FastAPI | Built a workflow that collects market data, trains a forecasting model, and serves predictions |
| Data Engineering Journey | Data engineering portfolio spanning ETL, streaming, analytics engineering, and cloud workflows | Python, SQL, Bash, Kafka, Spark, PostgreSQL, MongoDB, Airflow, dbt | Demonstrates progression from core data skills to end-to-end pipeline design across multiple production-style projects |
| AB Testing Advertising Campaign | Experiment analysis and statistical testing | Python, visualization, t-test, permutation test | Evaluated whether a test campaign outperformed a control campaign and identified no significant difference |
| Digital Rights Management | BI reporting and usage tracking | SQL, MySQL, Power BI / BI dashboarding | Built reporting to monitor DRM key usage and support forecasting and budgeting |

Project categories

Machine learning

  • Classification and predictive modeling
  • Feature engineering and model comparison
  • Model interpretation and decision support

Analytics and experimentation

  • Exploratory data analysis
  • Diagnostic analysis
  • A/B testing and hypothesis testing
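
As a rough illustration of the hypothesis-testing work listed above, the following is a minimal sketch of a two-sided permutation test for a difference in group means. It is not taken from the AB Testing Advertising Campaign notebook: the function name `permutation_test` and the sample numbers are hypothetical, and only the Python standard library is used.

```python
import random
from statistics import mean


def permutation_test(control, test, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(test) - mean(control))
    pooled = list(control) + list(test)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random by shuffling the pooled data.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical daily conversion counts, for illustration only.
control = [12, 15, 11, 14, 13, 16, 12, 15]
test = [13, 14, 12, 15, 14, 15, 13, 16]
p_value = permutation_test(control, test)
print(f"p-value: {p_value:.3f}")
```

A large p-value here would be read the same way as the project's reported outcome: no evidence that the test group differs from the control group.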

Forecasting and data products

  • Time-series modeling
  • ETL pipelines
  • API-oriented analytics workflows
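
To make the time-series modeling item more concrete, here is a small sketch of how a GARCH(1,1) model turns the latest return and variance into a multi-step variance forecast. It is an assumption-laden illustration rather than the code in the Stock Volatility Forecasting In Vietnam project: the function name and all parameter values are hypothetical, and in practice the parameters would be estimated from data by a fitting library.

```python
def garch11_variance_forecast(omega, alpha, beta, last_return, last_variance, horizon):
    """Forecast conditional variance for `horizon` steps under GARCH(1,1)."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be below 1 for a finite long-run variance")
    # One step ahead uses the last observed squared return and variance.
    variance = omega + alpha * last_return ** 2 + beta * last_variance
    forecasts = [variance]
    # Further steps replace the unknown squared return with its expected value,
    # which is the previous variance forecast.
    for _ in range(horizon - 1):
        variance = omega + (alpha + beta) * variance
        forecasts.append(variance)
    return forecasts


# Hypothetical parameters and inputs (returns in percent), for illustration only.
omega, alpha, beta = 0.05, 0.10, 0.85
forecasts = garch11_variance_forecast(omega, alpha, beta,
                                      last_return=2.0, last_variance=1.5, horizon=5)
long_run_variance = omega / (1 - alpha - beta)
print([round(v, 4) for v in forecasts], round(long_run_variance, 4))
```

The forecasts decay geometrically toward the long-run variance `omega / (1 - alpha - beta)`, which is why the stationarity check on `alpha + beta` matters.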

Business intelligence

  • SQL-based data extraction
  • Dashboard design
  • Operational reporting
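
The SQL-based extraction step behind a usage report can be sketched as a simple monthly aggregation. This example is illustrative only: the `key_usage` table, its columns, and the rows are hypothetical, and an in-memory SQLite database stands in for the MySQL source so the snippet runs on its own.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the real reporting source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE key_usage (used_on TEXT, keys_used INTEGER)")
conn.executemany(
    "INSERT INTO key_usage (used_on, keys_used) VALUES (?, ?)",
    [("2024-01-05", 120), ("2024-01-20", 180), ("2024-02-03", 90)],
)

# Aggregate usage per calendar month, the kind of result a dashboard would chart.
monthly_usage = conn.execute(
    """
    SELECT strftime('%Y-%m', used_on) AS month, SUM(keys_used) AS total_keys
    FROM key_usage
    GROUP BY month
    ORDER BY month
    """
).fetchall()
conn.close()

for month, total_keys in monthly_usage:
    print(month, total_keys)
```

The date-formatting function differs between engines (`strftime` here, `DATE_FORMAT` in MySQL), so the query would need a small adjustment before running against the actual database.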

Repository structure

  • Titanic Survival Prediction/ — classification project based on the Titanic dataset
  • Stock Volatility Forecasting In Vietnam/ — volatility forecasting workflow and supporting Python modules
  • Data Engineering Journey/ — featured overview of a separate repository focused on ETL, streaming, analytics engineering, and cloud data pipelines
  • AB Testing Advertising Campaign/ — experiment analysis notebook and documentation
  • Digital Rights Management/ — BI dashboard assets and supporting project files
  • Fraud Detection/, Recipe_Site_Traffic_Prediction/, Shop_the_Look_Recommender/ and more — additional portfolio work in progress or under expansion

Core tools used across projects

Python, Pandas, scikit-learn, SQL, SQLite, MySQL, PostgreSQL, FastAPI, Kafka, Spark, dbt, Airflow, MongoDB, Power BI, Data Visualization, Hypothesis Testing, Time-Series Forecasting

What recruiters and hiring managers can expect

  • A portfolio built around real analytical workflows, not just isolated models
  • Projects that connect technical execution with measurable business questions
  • Clear evidence of curiosity across multiple data domains and problem types

Contact

If you are hiring for data science, machine learning, analytics, or BI roles, feel free to reach out:

License

This repository is licensed under the GNU General Public License v3.0.

About

Welcome to my Data Science Projects Repository! This repository contains a collection of my data science projects, showcasing my skills and expertise in the field. Each project demonstrates different aspects of data analysis, machine learning, and visualization.
