Status: Complete - All pipeline steps successfully deployed!
This project builds an end-to-end predictive maintenance system for small and large engines using sensor data (RPM, pressures, temperatures) to classify whether an engine is healthy or requires maintenance.
The work is organized to satisfy the provided interim and final report rubrics, including:
- Data registration on Hugging Face
- Exploratory Data Analysis (EDA)
- Data preparation and dataset versioning
- Model building with experimentation tracking
- Model deployment with Docker + Streamlit on Hugging Face Spaces
- Automated GitHub Actions workflow
Interactive Streamlit application for real-time engine condition predictions with sensor visualizations.
View Model on Hugging Face
Trained Random Forest model with hyperparameter tuning, versioned on Hugging Face Model Hub.
Access Dataset Repository
Version-controlled datasets including raw data and train/test splits.
Complete source code, documentation, and CI/CD pipeline.
Automated CI/CD pipeline with 4 sequential jobs for data registration, preparation, training, and deployment.
mlops/
├── data/                      # Raw and processed data
│   ├── engine_data.csv        # Original engine sensor dataset
│   └── processed/             # Train/test splits
│       ├── train.csv
│       └── test.csv
├── notebooks/                 # EDA and experimentation notebooks
├── src/                       # Main source code
│   ├── config.py              # Central configuration
│   ├── data_register.py       # Register raw data to HF Dataset
│   ├── data_prep.py           # Data cleaning and splitting
│   ├── hf_data_utils.py       # HF Dataset Hub utilities
│   ├── train.py               # Model training with MLflow
│   ├── hf_model_utils.py      # HF Model Hub utilities
│   ├── inference.py           # Prediction utilities
│   ├── app.py                 # Streamlit web application
│   └── deploy_to_hf.py        # Deploy to HF Space
├── .github/
│   └── workflows/
│       └── pipeline.yml       # CI/CD pipeline
├── Dockerfile                 # Container definition for deployment
├── requirements.txt           # Python dependencies
└── README.md                  # This file
- src/config.py - Central configuration (paths, Hugging Face repo names, MLflow config)
- src/data_register.py - Registers raw dataset to Hugging Face Dataset Hub
- src/data_prep.py - Loads data, cleans it, and creates train/test splits
- src/train.py - Model training, hyperparameter tuning, MLflow logging
- src/app.py - Streamlit web application for interactive predictions
- src/deploy_to_hf.py - Deploys app to Hugging Face Space
- .github/workflows/pipeline.yml - Automated CI/CD pipeline
The MLOps pipeline consists of 6 stages. GitHub Actions automates the registration, preparation, training, and deployment stages as four jobs:
- Script: src/data_register.py
- Action: Creates/uses Hugging Face dataset repo and uploads raw data
- Output: ananttripathiak/engine-maintenance-dataset with data/engine_data.csv
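
The registration step above could look roughly like the following. This is a minimal sketch, not the contents of src/data_register.py: the function name `register_raw_data` is hypothetical, and `api` is assumed to be a `huggingface_hub.HfApi` instance (or anything exposing the same `create_repo` and `upload_file` methods).

```python
from pathlib import Path

DATASET_REPO = "ananttripathiak/engine-maintenance-dataset"  # from this README
RAW_DATA_PATH = Path("data/engine_data.csv")                  # from this README


def register_raw_data(api, repo_id=DATASET_REPO, local_path=RAW_DATA_PATH):
    """Create the dataset repo if needed, then upload the raw CSV.

    `api` is assumed to behave like huggingface_hub.HfApi.
    """
    # exist_ok=True turns the call into a no-op when the repo already exists,
    # which corresponds to the "creates/uses" behaviour described above.
    api.create_repo(repo_id=repo_id, repo_type="dataset", exist_ok=True)

    # Keep the same relative path inside the repo as on disk.
    path_in_repo = local_path.as_posix()
    api.upload_file(
        path_or_fileobj=str(local_path),
        path_in_repo=path_in_repo,
        repo_id=repo_id,
        repo_type="dataset",
    )
    return path_in_repo


# Typical use (needs `pip install huggingface_hub` and an HF_TOKEN variable):
#   import os
#   from huggingface_hub import HfApi
#   register_raw_data(HfApi(token=os.environ["HF_TOKEN"]))
```

Passing the API object in as a parameter keeps the function easy to exercise with a stub, without touching the network.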
- Script: src/eda.py (or use notebooks)
- Action: Performs data overview, univariate/bivariate/multivariate analysis
- Output: Visualizations and insights about engine health patterns
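
As a small illustration of the bivariate part of this analysis, the sketch below compares the mean of one sensor reading across the two engine classes. It uses only the standard library as a stand-in for whatever the EDA script or notebooks actually use. The column names `engine_rpm` and `engine_condition` are hypothetical, and the four sample rows are made up so the example runs without the real dataset.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def class_means(rows, feature, label):
    """Mean of one sensor column for each value of the label column."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[label]].append(float(row[feature]))
    return {cls: mean(values) for cls, values in grouped.items()}


# Made-up sample with hypothetical column names (not taken from engine_data.csv).
SAMPLE_CSV = """engine_rpm,engine_condition
700,0
900,0
500,1
300,1
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = class_means(rows, feature="engine_rpm", label="engine_condition")
print(summary)
```

A large gap between the per-class means would suggest that the sensor carries signal for the classifier.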
- Script: src/data_prep.py
- Action: Cleans data, creates train/test splits, uploads to dataset repo
- Output: data/train.csv and data/test.csv in dataset repo
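
The splitting step can be sketched as a stratified split, which keeps the ratio of healthy to faulty engines similar in both files. This is a dependency-free stand-in, not the code in src/data_prep.py. The 20% test fraction, the seed, and the `label_key` parameter are assumptions, and whether the real script stratifies at all is not stated here.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key, test_fraction=0.2, seed=42):
    """Split rows into (train, test), keeping class proportions similar."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group rows by class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    train, test = [], []
    for members in by_class.values():
        members = members[:]  # copy so the caller's list is left untouched
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test
```

A reproducible split matters for dataset versioning: re-running the step with the same seed and input yields the same train and test files.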
- Script: src/train.py
- Action: Trains Random Forest with hyperparameter tuning, logs to MLflow, uploads best model
- Output: ananttripathiak/engine-maintenance-model with trained model
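
One way the tuning and logging described above might fit together is shown below. This is a sketch under several assumptions: the search space in `PARAM_GRID` is hypothetical, grid search with 5-fold cross-validation is one possible tuning strategy, and the F1 metric assumes binary labels encoded as 0 and 1. The upload to the model repo is left out.

```python
from math import prod

# Hypothetical search space; the grid used in src/train.py is not shown in this README.
PARAM_GRID = {
    "n_estimators": [100, 200],
    "max_depth": [None, 10],
    "min_samples_leaf": [1, 5],
}


def n_candidates(grid):
    """Number of hyperparameter combinations a full grid search will try."""
    return prod(len(values) for values in grid.values())


def train_and_log(X_train, y_train, X_test, y_test, param_grid=PARAM_GRID):
    """Tune a Random Forest with grid search and log the result to MLflow."""
    # Imported lazily so the helpers above work without the ML stack installed.
    import mlflow
    import mlflow.sklearn
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import f1_score
    from sklearn.model_selection import GridSearchCV

    search = GridSearchCV(
        RandomForestClassifier(random_state=42),
        param_grid,
        scoring="f1",  # assumes binary 0/1 labels
        cv=5,
    )
    with mlflow.start_run():
        search.fit(X_train, y_train)
        test_f1 = f1_score(y_test, search.predict(X_test))
        mlflow.log_params(search.best_params_)
        mlflow.log_metric("test_f1", test_f1)
        mlflow.sklearn.log_model(search.best_estimator_, "model")
    return search.best_estimator_, test_f1
```

With this grid, `n_candidates(PARAM_GRID)` is 8, so a 5-fold search fits 40 models plus one final refit. That count is worth checking before running training in CI.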
- App: src/app.py - Streamlit web application
- Container: Dockerfile - Container definition
- Script: src/deploy_to_hf.py - Deploys to Hugging Face Space
- Output: Live app at ananttripathiak/engine-maintenance-space
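
The deployment script could be as small as the sketch below. It assumes `api` behaves like `huggingface_hub.HfApi` and that the Space uses the Docker SDK, which follows from the Dockerfile mentioned above. The function name `deploy_space` and the ignore patterns are hypothetical.

```python
SPACE_REPO = "ananttripathiak/engine-maintenance-space"  # from this README


def deploy_space(api, folder_path=".", repo_id=SPACE_REPO):
    """Create the Space if needed and push the project folder to it.

    `api` is assumed to behave like huggingface_hub.HfApi.
    """
    # A Docker Space builds and runs the Dockerfile found at the repo root.
    api.create_repo(
        repo_id=repo_id,
        repo_type="space",
        space_sdk="docker",
        exist_ok=True,
    )
    api.upload_folder(
        folder_path=folder_path,
        repo_id=repo_id,
        repo_type="space",
        # Hypothetical exclusions: local environments and MLflow run artifacts.
        ignore_patterns=[".venv/*", "mlruns/*"],
    )
    return f"https://huggingface.co/spaces/{repo_id}"
```

The Space rebuilds its container after each upload, so the live app can lag a few minutes behind a successful deploy job.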
- File: .github/workflows/pipeline.yml
- Jobs:
    - register-dataset - runs src/data_register.py
    - data-prep - runs src/data_prep.py
    - model-training - runs src/train.py
    - deploy-hosting - runs src/deploy_to_hf.py
- View Runs: GitHub Actions
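
To reproduce the job ordering locally, a small runner can execute the same four scripts in sequence and stop at the first failure. This is a local stand-in written for illustration, not part of the workflow file. It assumes each job depends on the previous one, and `run_pipeline` is a hypothetical helper.

```python
import subprocess
import sys

# Job names and scripts as listed above, in execution order.
PIPELINE_STEPS = [
    ("register-dataset", "src/data_register.py"),
    ("data-prep", "src/data_prep.py"),
    ("model-training", "src/train.py"),
    ("deploy-hosting", "src/deploy_to_hf.py"),
]


def run_pipeline(steps=PIPELINE_STEPS, runner=subprocess.run):
    """Run each step in order; raise as soon as one exits with a non-zero code."""
    completed = []
    for job_name, script in steps:
        # Use the current interpreter so the active virtual environment is honoured.
        result = runner([sys.executable, script])
        if result.returncode != 0:
            raise RuntimeError(f"{job_name} failed with exit code {result.returncode}")
        completed.append(job_name)
    return completed
```

Calling `run_pipeline()` from the repository root runs all four steps. The `runner` parameter exists so the ordering logic can be checked with a stub instead of real subprocesses.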
- Clone the repository:
    git clone https://github.com/ananttripathi/engine-predictive-maintenance.git
    cd engine-predictive-maintenance
- Set up virtual environment:
    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    pip install -r requirements.txt
- Run pipeline steps:
    # Register data
    python src/data_register.py
    # Prepare data
    python src/data_prep.py
    # Train model
    python src/train.py
    # Run app locally
    streamlit run src/app.py
The pipeline runs automatically on every push to the main branch via GitHub Actions. Workflow runs are listed under the repository's GitHub Actions tab.
- Python 3.10 - Programming language
- scikit-learn - Machine learning (Random Forest)
- MLflow - Experiment tracking and model registry
- Hugging Face Hub - Dataset, model, and space hosting
- Streamlit - Web application framework
- Docker - Containerization
- GitHub Actions - CI/CD automation
- Plotly - Interactive visualizations
This project is configured with:
- Hugging Face Username: ananttripathiak
- GitHub Username: ananttripathi
- Dataset Repo: ananttripathiak/engine-maintenance-dataset
- Model Repo: ananttripathiak/engine-maintenance-model
- Space Repo: ananttripathiak/engine-maintenance-space
- GitHub Repo: ananttripathi/engine-predictive-maintenance
Update src/config.py with your Hugging Face username:

import os

# Each setting falls back to the default repo name when the variable is unset.
HF_DATASET_REPO = os.getenv("HF_DATASET_REPO", "ananttripathiak/engine-maintenance-dataset")
HF_MODEL_REPO = os.getenv("HF_MODEL_REPO", "ananttripathiak/engine-maintenance-model")
HF_SPACE_REPO = os.getenv("HF_SPACE_REPO", "ananttripathiak/engine-maintenance-space")

Or set environment variables:

export HF_TOKEN="hf_your_token_here"
export HF_DATASET_REPO="ananttripathiak/engine-maintenance-dataset"
export HF_MODEL_REPO="ananttripathiak/engine-maintenance-model"
export HF_SPACE_REPO="ananttripathiak/engine-maintenance-space"

A. Create GitHub Repository:
- Create a new repository on GitHub (e.g., engine-predictive-maintenance)
- Push this mlops folder to it:
    git init
    git add .
    git commit -m "Initial commit: Predictive maintenance MLOps pipeline"
    git branch -M main  # ensures the branch is named main before pushing
    git remote add origin https://github.com/your-username/engine-predictive-maintenance.git
    git push -u origin main
B. Add GitHub Secrets:
Go to your GitHub repo → Settings → Secrets and variables → Actions → New repository secret
Add these 4 secrets:
- HF_TOKEN - Your Hugging Face access token (from https://huggingface.co/settings/tokens)
- HF_DATASET_REPO - e.g., ananttripathiak/engine-maintenance-dataset
- HF_MODEL_REPO - e.g., ananttripathiak/engine-maintenance-model
- HF_SPACE_REPO - e.g., ananttripathiak/engine-maintenance-space
For detailed setup instructions, see CONFIGURATION_GUIDE.md
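
Before triggering the workflow, it can help to confirm that all four settings are visible to the process. The check below is a hypothetical helper, not part of the repository, and it only verifies that each value is present and non-empty, not that the token is valid.

```python
import os

# The four names configured as GitHub secrets in this README.
REQUIRED_SETTINGS = ("HF_TOKEN", "HF_DATASET_REPO", "HF_MODEL_REPO", "HF_SPACE_REPO")


def missing_settings(env=None):
    """Return the required setting names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


missing = missing_settings()
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All required settings are present.")
```

When running locally, the three repo names fall back to the defaults in src/config.py, so only HF_TOKEN is strictly needed there. In GitHub Actions the README expects all four to be defined as secrets.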