A collection of machine learning projects built to explore different problem types, algorithms, and workflows — ranging from basic regression to more advanced classification and custom metric-based systems.
Each project follows a consistent and scalable structure:
```
Project_Name/
│
├── Data_Viz_Data/           # Data visualization data and EDAs
├── Model_Source_Code/       # Jupyter notebooks & dataset files
│   ├── electricity_data.csv
│   ├── initial.ipynb        # Experimental / testing notebook
│   └── model_.ipynb         # Finalized model implementation
└── README.md                # Project-specific documentation
```
Each project follows a structured development pipeline:

`initial.ipynb`
- Used for experimentation and testing
- Feature engineering, EDA, and trying different models
- A safe space for breaking things and iterating

`model_.ipynb`
- Clean, finalized version of the model
- Only stable and verified logic is included
- Represents the "production-ready" notebook
Any new idea or modification is first tested in `initial.ipynb` and, once validated, transferred to `model_.ipynb`.
- Each project includes a sample dataset (100–200 rows) to allow quick setup and testing without heavy downloads.
- Due to the large size of full datasets, they are not stored in this repository. Instead, Kaggle dataset links are provided in each model’s README for easy access.
- This approach ensures:
- ⚡ Fast cloning of the repository
- 🧪 Easy experimentation with sample data
- 📦 Access to complete datasets when needed
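As a sketch of that quick-setup workflow: the bundled samples can be loaded and inspected in a few lines. The file name `electricity_data.csv` comes from the structure above, but the column names below are hypothetical stand-ins (each project's README documents the real schema), and the data here is generated in memory purely for illustration:

```python
import io

import numpy as np
import pandas as pd

# Generate a small synthetic table in the spirit of the bundled
# 100-200 row samples; the column names are illustrative only.
rng = np.random.default_rng(42)
n_rows = 150
sample = pd.DataFrame({
    "hour": rng.integers(0, 24, n_rows),
    "temperature_c": rng.normal(22, 5, n_rows).round(1),
    "consumption_kwh": rng.normal(1.2, 0.4, n_rows).round(3),
})

# Round-trip through CSV text, exactly as a cloned sample file
# (e.g. electricity_data.csv) would be read for quick testing.
buf = io.StringIO()
sample.to_csv(buf, index=False)
buf.seek(0)
df = pd.read_csv(buf)

print(df.shape)   # (150, 3)
print(df.head())
```

Because the samples are this small, a notebook can go from clone to first EDA plot in seconds.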
Projects currently included:

- ⚡ Electricity Consumption Prediction
- 🌡️ CPU Temperature Prediction
- 📖 Student Marks Prediction
- 💵 Bank Fraud Prediction
Skills and concepts covered:

- Regression & Classification
- Feature Engineering
- Data Visualization (EDA)
- Model Evaluation Metrics
- Handling Different Dataset Types
- Iterative Model Development
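The regression and model-evaluation skills above can be sketched end to end with scikit-learn. This is a minimal, self-contained example on synthetic data, not code from any of the projects; the feature and coefficient values are arbitrary:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression data standing in for a project dataset.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0, 0.5, 200)

# Hold out a test split so the metrics measure generalization.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

# Two common evaluation metrics for regression.
print(f"MAE: {mean_absolute_error(y_test, pred):.3f}")
print(f"R^2: {r2_score(y_test, pred):.3f}")
```

The same train/split/fit/evaluate shape carries over to the classification projects, swapping in a classifier and metrics such as accuracy or F1.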
Planned improvements:

- Add advanced models (ensemble, boosting, etc.)
- Hyperparameter tuning
- Model deployment (Flask / API)
- Performance comparison across models
- Centralized experiment tracking
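Two of the planned items (ensemble models and hyperparameter tuning) could start from a sketch like the following. Everything here is illustrative, not repository code: the dataset is synthetic and the parameter grid values are placeholders:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Small synthetic classification task (e.g. fraud vs. not-fraud).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Candidate hyperparameters; values are placeholders to tune later.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

# Cross-validated grid search over an ensemble model.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_)
print(f"best CV accuracy: {search.best_score_:.3f}")
```

Swapping `GridSearchCV` for `RandomizedSearchCV` keeps the same interface while scaling to larger grids.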
Tech stack:

- Python
- NumPy, Pandas
- Matplotlib, Seaborn
- Scikit-learn
- Jupyter Notebook
This repository is built with a strong focus on:
- Structured experimentation
- Clean separation between testing and final models
- Consistency across projects
- Practical use of Kaggle datasets
Maintained by Hardik Basu.