This repository showcases hands-on data science and analytics projects across machine learning, experimentation, forecasting, business intelligence, and applied Python.
- End-to-end project work: data extraction, cleaning, modeling, evaluation, and communication
- Business-focused problem solving across product, finance, and operations use cases
- Experience with both notebooks and deployable Python applications
- Evidence of practical delivery: dashboards, APIs, forecasting workflows, and Kaggle results
| Project | Focus | Tools | Outcome |
|---|---|---|---|
| Titanic Survival Prediction | End-to-end classification and model interpretation | Python, scikit-learn, SHAP | Reached about 78% test accuracy and ranked in the top 23% on Kaggle |
| Stock Volatility Forecasting In Vietnam | ETL, time-series forecasting, and API development | Python, Pandas, SQLite, GARCH, FastAPI | Built a workflow that collects market data, trains a forecasting model, and serves predictions |
| Data Engineering Journey | Data engineering portfolio spanning ETL, streaming, analytics engineering, and cloud workflows | Python, SQL, Bash, Kafka, Spark, PostgreSQL, MongoDB, Airflow, dbt | Demonstrates progression from core data skills to end-to-end pipeline design across multiple production-style projects |
| AB Testing Advertising Campaign | Experiment analysis and statistical testing | Python, visualization, t-test, permutation test | Evaluated whether a test campaign outperformed a control campaign and identified no significant difference |
| Digital Rights Management | BI reporting and usage tracking | SQL, MySQL, Power BI / BI dashboarding | Built reporting to monitor DRM key usage and support forecasting and budgeting |
- Classification and predictive modeling
- Feature engineering and model comparison
- Model interpretation and decision support
- Exploratory data analysis
- Diagnostic analysis
- A/B testing and hypothesis testing
- Time-series modeling
- ETL pipelines
- API-oriented analytics workflows
- SQL-based data extraction
- Dashboard design
- Operational reporting
- Titanic Survival Prediction/: classification project based on the Titanic dataset
- Stock Volatility Forecasting In Vietnam/: volatility forecasting workflow and supporting Python modules
- Data Engineering Journey/: featured overview of a separate repository focused on ETL, streaming, analytics engineering, and cloud data pipelines
- AB Testing Advertising Campaign/: experiment analysis notebook and documentation
- Digital Rights Management/: BI dashboard assets and supporting project files
- Fraud Detection/, Recipe_Site_Traffic_Prediction/, Shop_the_Look_Recommender/, and more: additional portfolio work in progress or under expansion
Python, Pandas, scikit-learn, SQL, SQLite, MySQL, PostgreSQL, FastAPI, Kafka, Spark, dbt, Airflow, MongoDB, Power BI, Data Visualization, Hypothesis Testing, Time-Series Forecasting
- A portfolio built around real analytical workflows, not just isolated models
- Projects that connect technical execution with measurable business questions
- Clear evidence of curiosity across multiple data domains and problem types
If you are hiring for data science, machine learning, analytics, or BI roles, feel free to reach out:
- Email: haison19952013@gmail.com
- GitHub: haison19952013
This repository is licensed under the GNU General Public License v3.0.