pratiksingh1296/README.md

Hi, I'm Pratik 👋

I'm a self-taught Data Scientist based in Navi Mumbai, focused on building ML systems where predictions are reliable, explainable, and directly useful for decisions, not just accurate on a leaderboard.

My work centres around a consistent theme: probabilistic modeling, uncertainty quantification, and calibrated predictions that translate into real business logic.


πŸ” What I Work On

  • Probabilistic modeling: calibrated probabilities over hard classifications
  • Uncertainty quantification: prediction intervals, confidence estimation
  • Explainability: SHAP-based model transparency for regulated domains
  • Time-series forecasting: demand forecasting with feature engineering
  • Simulation: Monte Carlo methods for season-level uncertainty

📂 Featured Projects

End-to-end credit risk pipeline predicting loan default probability on the Home Credit dataset.

  • Platt scaling calibration, reducing Expected Calibration Error (ECE) from 0.041 to 0.004
  • Risk bucketing (Low / Medium / High / Very High) aligned with lending policy
  • SHAP explainability for individual applicant decisions and regulatory transparency
  • Python, Scikit-learn, XGBoost, SHAP
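
The calibration and bucketing steps listed above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's actual code: the function names, the gradient-descent settings, and the bucket cut-offs in `BUCKET_EDGES` are all hypothetical, and ECE is computed with ten equal-width bins, which is one common convention.

```python
import math
from bisect import bisect_right


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_platt(scores, labels, lr=0.1, n_iter=2000):
    """Platt scaling: fit p = sigmoid(a * score + b) by gradient descent on log loss.

    `scores` are raw model outputs on a held-out calibration set.
    The learning rate and iteration count are illustrative assumptions.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Equal-width binned ECE: weighted mean gap between observed and predicted rates."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += len(members) / n * abs(observed - predicted)
    return ece


# Hypothetical cut-offs; a real lending policy would supply its own.
BUCKET_EDGES = [0.05, 0.15, 0.30]
BUCKET_NAMES = ["Low", "Medium", "High", "Very High"]


def risk_bucket(p_default):
    """Map a calibrated default probability to a named risk bucket."""
    return BUCKET_NAMES[bisect_right(BUCKET_EDGES, p_default)]
```

With `a, b = fit_platt(val_scores, val_labels)` fitted on a held-out set, a new applicant's bucket would be `risk_bucket(sigmoid(a * raw_score + b))`. Fitting the calibrator on data the base model never trained on is what keeps the calibrated probabilities honest.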

Hourly electricity demand forecasting on real EIA grid data (Texas, 2018–2023).

  • Time-series feature engineering: lag features, rolling stats, cyclical encoding
  • XGBoost achieving 2.40% MAPE, a 48% improvement over the seasonal naive baseline
  • Weather integration via Open-Meteo API
  • Python, XGBoost, Scikit-learn, Pandas
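
For readers unfamiliar with the feature types named above, here is a small sketch of lag features, a rolling mean, cyclical hour encoding, MAPE, and a seasonal naive forecast. It uses only the standard library to stay dependency-free. The lag choices (1 and 24 hours), the 24-hour window, the 24-hour season, and all function names are assumptions for illustration, not the project's configuration.

```python
import math


def cyclical_encode(value, period):
    """Encode a periodic value as (sin, cos) so that e.g. hour 23 sits next to hour 0."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def make_features(load, hour_of_day, lags=(1, 24), window=24):
    """Build one feature row for every hour that has enough history."""
    start = max(max(lags), window)
    rows = []
    for t in range(start, len(load)):
        row = {f"lag_{k}": load[t - k] for k in lags}
        past = load[t - window:t]  # excludes load[t] itself to avoid target leakage
        row["roll_mean"] = sum(past) / window
        row["hour_sin"], row["hour_cos"] = cyclical_encode(hour_of_day[t], 24)
        row["target"] = load[t]
        rows.append(row)
    return rows


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no zero actuals."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


def seasonal_naive(load, season=24):
    """Forecast for hour t is the value at t - season; aligns with load[season:]."""
    return load[:-season]
```

The baseline would then be scored with `mape(load[24:], seasonal_naive(load))` and compared against the model's MAPE on the same hours. Building the rolling mean only from past values matters: including the current hour would leak the target into its own features.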

Probabilistic match outcome modeling with explicit focus on draw modeling.

  • Calibrated Home / Draw / Away probabilities using Platt Scaling
  • Expected Points (xPts) league table from match-level probabilities
  • 10,000 Monte Carlo season simulations for title, top-4, and relegation probabilities
  • Python, XGBoost, Monte Carlo simulation
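
As a rough illustration of how match-level probabilities can feed both an xPts table and a season simulation, consider the sketch below. It assumes standard 3/1/0 league scoring, treats matches as independent, and breaks points ties at random (a real league table would use goal difference). The function names and the fixture format are hypothetical.

```python
import random


def expected_points(p_home, p_draw, p_away):
    """Return (home xPts, away xPts) for one match under 3/1/0 scoring."""
    return 3 * p_home + p_draw, 3 * p_away + p_draw


def simulate_title_odds(fixtures, n_sims=10_000, seed=0):
    """Estimate each team's title probability by Monte Carlo.

    fixtures: list of (home, away, (p_home, p_draw, p_away)) tuples.
    """
    rng = random.Random(seed)
    teams = sorted({t for home, away, _ in fixtures for t in (home, away)})
    titles = dict.fromkeys(teams, 0)
    for _ in range(n_sims):
        points = dict.fromkeys(teams, 0)
        for home, away, probs in fixtures:
            # Sample one outcome from the calibrated Home / Draw / Away probabilities.
            outcome = rng.choices("HDA", weights=probs)[0]
            if outcome == "H":
                points[home] += 3
            elif outcome == "A":
                points[away] += 3
            else:
                points[home] += 1
                points[away] += 1
        best = max(points.values())
        leaders = [t for t in teams if points[t] == best]
        titles[rng.choice(leaders)] += 1  # simplification: random tie-break
    return {team: count / n_sims for team, count in titles.items()}
```

Summing `expected_points` over a team's fixtures gives its xPts total, while top-4 or relegation probabilities would follow the same simulation loop with a ranking of `points` in place of the single `max`.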

πŸ› οΈ Tech Stack

Python, Scikit-learn, XGBoost, Pandas, NumPy, Git


📫 Connect

LinkedIn GitHub

Pinned

  1. credit-risk-modeling (Public)

    End-to-end credit risk modeling pipeline with probability calibration, risk bucketing, and SHAP explainability.

    Jupyter Notebook

  2. premier-league-forecasting (Public)

    Probabilistic Premier League match forecasting with calibrated predictions and Monte Carlo season simulations.

    Jupyter Notebook

  3. electricity-demand-forecasting (Public)

    Electricity demand forecasting using time-series feature engineering and ML models (Linear Regression, Random Forest, XGBoost) with strong baseline comparison.

    Jupyter Notebook