A machine learning project that applies the Random Forest Classifier to predict loan approval decisions. The project demonstrates how AI can support financial institutions in automating risk assessment and improving lending accuracy.
This repository contains a stand-alone implementation of a Random Forest model for loan approval classification. The main workflow is:
- Data Preprocessing – Cleaning and preparing the dataset (handling categorical features, missing values, etc.).
- Model Training – Building and tuning a Random Forest Classifier.
- Evaluation – Comparing accuracy, precision, recall, and F1-score.
- Bias & Fairness Considerations – Exploring how datasets can affect outcomes.
- Optimisation – Testing different hyperparameters to improve results.
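The preprocessing, training, and evaluation steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, and every column name (`income`, `loan_amount`, `credit_history`, `property_area`, `approved`) is a hypothetical stand-in for whatever `train.csv` contains.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for train.csv; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "income": rng.normal(5000, 1500, n),
    "loan_amount": rng.normal(150, 40, n),
    "credit_history": rng.choice([0.0, 1.0], n, p=[0.2, 0.8]),
    "property_area": rng.choice(["Urban", "Rural", "Semiurban"], n),
})
# Toy label rule, used only so the example has something to learn.
df["approved"] = ((df["credit_history"] == 1.0) & (df["income"] > 4000)).astype(int)
# Blank out a few values to mimic missing data.
df.loc[rng.choice(n, 30, replace=False), "income"] = np.nan
df.loc[rng.choice(n, 30, replace=False), "property_area"] = np.nan

numeric = ["income", "loan_amount", "credit_history"]
categorical = ["property_area"]

# Impute numeric gaps with the median; impute and one-hot encode categoricals.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), numeric),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("forest", RandomForestClassifier(n_estimators=200, random_state=42)),
])

X_train, X_test, y_train, y_test = train_test_split(
    df[numeric + categorical], df["approved"],
    test_size=0.25, stratify=df["approved"], random_state=42,
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# The four metrics named in the Evaluation step.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Wrapping the imputers and encoder in a single `Pipeline` means the same transformations are fitted on the training split only and then reused on the test split, which avoids leaking test information into preprocessing.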
Baseline accuracy: ~80%
Improved accuracy after tuning: up to 83%
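One common way to obtain a baseline-versus-tuned comparison like the one above is a cross-validated grid search. The sketch below assumes scikit-learn's `GridSearchCV` and uses synthetic data from `make_classification`; the parameter grid is illustrative and is not the grid used in the notebook, so the printed numbers will not match the figures reported here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic binary classification data standing in for the loan dataset.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Baseline: default hyperparameters.
baseline = RandomForestClassifier(random_state=0).fit(X_train, y_train)
baseline_acc = baseline.score(X_test, y_test)

# Illustrative grid (an assumption, not the project's actual search space).
param_grid = {
    "n_estimators": [100, 200],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 3],
}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# GridSearchCV refits the best configuration on the full training split,
# so score() evaluates that refitted model on the held-out test data.
tuned_acc = search.score(X_test, y_test)

print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"tuned accuracy:    {tuned_acc:.3f}")
print(f"best parameters:   {search.best_params_}")
```

Tuning does not guarantee an improvement on every dataset; the held-out test score is what tells you whether the selected parameters actually generalise better than the defaults.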
Advantages: Works with both categorical (once encoded) and numerical features, is less prone to overfitting than a single decision tree, and does not require feature normalisation.
This makes Random Forest a robust choice for loan approval prediction compared to single decision trees.
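To check the comparison with a single decision tree on your own data, a cross-validated side-by-side run is usually enough. The sketch below is an assumption-laden illustration on synthetic, deliberately noisy data (`flip_y=0.1`); it shows how the comparison could be set up, not the result obtained in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 10% label noise, standing in for the loan dataset.
X, y = make_classification(
    n_samples=500, n_features=12, n_informative=5, flip_y=0.1, random_state=1
)

# 5-fold cross-validated accuracy for each model on the same data.
tree_scores = cross_val_score(DecisionTreeClassifier(random_state=1), X, y, cv=5)
forest_scores = cross_val_score(
    RandomForestClassifier(n_estimators=200, random_state=1), X, y, cv=5
)

print(f"decision tree: mean={tree_scores.mean():.3f}, std={tree_scores.std():.3f}")
print(f"random forest: mean={forest_scores.mean():.3f}, std={forest_scores.std():.3f}")
```

Looking at the spread across folds as well as the mean helps judge robustness: a model whose fold scores vary widely is more sensitive to which samples it was trained on.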
- Clone the repository:
git clone https://github.com/Arslan2003/Random_Forest_for_Loan-Approval.git
cd Random_Forest_for_Loan-Approval
- Create a Virtual Environment (Recommended):
# Using venv
python -m venv venv
# Activate the environment
# On Windows
venv\Scripts\activate
# On macOS/Linux
source venv/bin/activate
- Install Dependencies:
pip install -r requirements.txt
- Dataset Check:
Ensure the datasets (train.csv and test.csv) are in the repository root folder.
- Run the Jupyter Notebook:
jupyter notebook RF_for_Loan_Approval.ipynb
- Experiment!
Contributions are welcome! Here are a few ideas for future improvements:
- Combine Random Forest with other classifiers (Ensemble Methods).
- Experiment with feature engineering to improve model interpretability.
- Expand dataset coverage for fairness and bias analysis.
- Deploy as a simple API for real-time predictions.
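As a possible starting point for the fairness and bias idea above, one simple check is to compare predicted approval rates across groups (often called the demographic parity difference). The helper name `approval_rate_by_group`, the group labels, and the predictions below are all hypothetical and made up for illustration; they are not taken from the project's data.

```python
import pandas as pd


def approval_rate_by_group(groups, predictions):
    """Return the share of positive (approved) predictions for each group."""
    frame = pd.DataFrame({"group": list(groups), "approved": list(predictions)})
    return frame.groupby("group")["approved"].mean()


# Hypothetical group membership and model predictions (1 = approved).
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]

rates = approval_rate_by_group(groups, predictions)
# Gap between the most- and least-approved groups; 0 means equal rates.
gap = rates.max() - rates.min()

print(rates)
print(f"approval-rate gap: {gap:.2f}")
```

A large gap is a prompt for further investigation rather than proof of bias on its own, since groups can differ in legitimate risk factors; it is only one of several fairness measures worth examining.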
If you’d like to contribute, please fork the repo and submit a pull request after making your changes.
Arslonbek Ishanov - First-Class Data Scientist & AI/ML Enthusiast
This project is licensed under the MIT License – see the LICENSE file for details.
For an in-depth explanation of the methodology, experiments, and analysis, please refer to the report.
The datasets were downloaded from Kaggle.
Topics: Machine Learning, Random Forest, Classification, Loan Approval, AI in Finance, Python, Scikit-Learn