📊 Tabular Churn Prediction (Baseline)

An end-to-end machine learning pipeline that predicts customer churn (whether a customer is likely to leave a service). Built with scikit-learn + FastAPI and using the Telco Customer Churn dataset.

🚀 Project Structure

tabular-churn-ml/
│
├── data/
│   ├── raw/         <- Original dataset (Telco churn CSV)
│   └── processed/   <- Train/val/test splits
│
├── models/          <- Saved trained model (churn_model.pkl)
│
├── notebooks/       <- Jupyter notebooks for exploration
│
├── src/
│   ├── download_data.py   # Download dataset
│   ├── data.py            # Split data into train/val/test
│   ├── train.py           # Train baseline model
│   └── app.py             # FastAPI app serving predictions
│
├── tests/           <- Unit tests (future)
│
├── requirements.txt <- Python dependencies
└── README.md        <- Project documentation

⚙️ Setup

1. Clone repo

git clone https://github.com/<your-username>/tabular-churn-ml.git
cd tabular-churn-ml

2. Create virtual environment

python3 -m venv venv
source venv/bin/activate   # Mac/Linux
venv\Scripts\activate      # Windows

3. Install dependencies

pip install -r requirements.txt

📥 Data

Download Telco Churn dataset from Kaggle automatically:

python3 src/download_data.py

Split into train/val/test:

python3 src/data.py

🏋️ Training

Train a baseline Logistic Regression model:

python3 src/train.py

This will:

Preprocess numeric & categorical features
Train a logistic regression model
Print evaluation metrics (AUC, F1, precision, recall)
Save model → models/churn_model.pkl

🌐 API (Serving Predictions)

Run FastAPI locally:

uvicorn src.app:app --reload

Open docs: http://127.0.0.1:8000/docs

Example request (JSON):

{
  "customerID": "1234-ABCD",
  "gender": "Female",
  "SeniorCitizen": 0,
  "Partner": "Yes",
  "Dependents": "No",
  "tenure": 5,
  "PhoneService": "Yes",
  "MultipleLines": "No",
  "InternetService": "Fiber optic",
  "OnlineSecurity": "No",
  "OnlineBackup": "Yes",
  "DeviceProtection": "No",
  "TechSupport": "No",
  "StreamingTV": "Yes",
  "StreamingMovies": "Yes",
  "Contract": "Month-to-month",
  "PaperlessBilling": "Yes",
  "PaymentMethod": "Electronic check",
  "MonthlyCharges": 89.5,
  "TotalCharges": 445.0
}

Example response:

{
  "prob": 0.734,
  "label": 1
}

✅ Definition of Done

Data pipeline (download_data.py, data.py)
Baseline churn model trained (train.py)
Model served via FastAPI (app.py)
Example request/response tested
Documentation (README.md)

📌 Next Steps (Future Sprints)

Add Streamlit/Gradio demo UI
Add SHAP explainability (top_factors)
Try tree-based models (XGBoost, LightGBM)
Add Dockerfile for deployment

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
.idea		.idea
configs		configs
models		models
notebooks		notebooks
src		src
tests		tests
.gitignore		.gitignore
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

📊 Tabular Churn Prediction (Baseline)

🚀 Project Structure

⚙️ Setup

1. Clone repo

2. Create virtual environment

3. Install dependencies

📥 Data

🏋️ Training

🌐 API (Serving Predictions)

✅ Definition of Done

📌 Next Steps (Future Sprints)

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

📊 Tabular Churn Prediction (Baseline)

🚀 Project Structure

⚙️ Setup

1. Clone repo

2. Create virtual environment

3. Install dependencies

📥 Data

🏋️ Training

🌐 API (Serving Predictions)

✅ Definition of Done

📌 Next Steps (Future Sprints)

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages