Credit Card Fraud Detection using Machine Learning

Project Overview

Credit card fraud is a real-world problem in which fraudulent transactions are very rare compared to legitimate ones. Because of this imbalance, a machine learning model can easily become biased toward the majority (legitimate) class and fail to detect fraud.

In this project, I built a machine learning model to detect fraudulent credit card transactions while carefully handling the issue of highly imbalanced data.


Problem Statement

The main challenge of this project was dealing with class imbalance. Since fraud cases form only a small portion of the dataset, a model trained without proper handling may predict every transaction as normal and still achieve high accuracy.

This project focuses on:

  • Identifying fraud cases effectively
  • Avoiding misleading accuracy results
  • Evaluating the model using meaningful performance metrics
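
The accuracy trap described above can be shown with a few lines of plain Python. This is a minimal sketch: the transaction counts are hypothetical values chosen for illustration, not figures from the project's dataset.

```python
# Hypothetical counts for illustration only (not from the project's dataset).
n_legit = 9_980
n_fraud = 20

# True labels: 0 = legitimate, 1 = fraud.
y_true = [0] * n_legit + [1] * n_fraud

# A "model" that labels every transaction as legitimate.
y_pred = [0] * len(y_true)

# Accuracy: fraction of all predictions that match the true label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall for the fraud class: frauds caught / all actual frauds.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / n_fraud

print(f"accuracy     = {accuracy:.3f}")  # 0.998
print(f"fraud recall = {recall:.3f}")    # 0.000
```

The classifier scores 99.8% accuracy while catching none of the fraud cases, which is why recall on the fraud class is the more informative number here.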

Solution Approach

To solve this problem, I followed a structured workflow:

  1. Data Exploration

    • Studied the dataset and class distribution
    • Identified severe imbalance between fraud and non-fraud transactions
  2. Data Preprocessing

    • Applied feature scaling for better model convergence
    • Split data into training and testing sets
  3. Handling Imbalanced Data

    • Used class weight balancing to give more importance to fraud cases
    • Focused on metrics beyond accuracy
  4. Model Building

    • Implemented Logistic Regression
    • Tuned model parameters to resolve convergence warnings
  5. Model Evaluation

    • Evaluated the model using precision, recall, F1-score, and confusion matrix
    • Prioritized recall to reduce missed fraud cases
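
The five steps above can be sketched roughly as follows. This is not the project's notebook code: the data is a synthetic stand-in generated with `make_classification`, and the specific settings (`max_iter=1000`, the 75/25 split, the 2% positive rate) are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Kaggle data (assumption): roughly 2% positives.
X, y = make_classification(
    n_samples=5000, n_features=10, n_informative=5,
    weights=[0.98, 0.02], random_state=42,
)

# 1. Data exploration: look at the class distribution.
counts = np.bincount(y)
print("class counts (legit, fraud):", counts)

# 2. Preprocessing: stratified split keeps the fraud ratio in both sets;
#    the scaler is fitted on the training data only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

# 3 and 4. Logistic Regression with class weighting.
#    class_weight="balanced" weights each class by
#    n_samples / (n_classes * class_count), so fraud rows count more.
#    max_iter=1000 is an assumed setting that gives the solver room to converge.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(X_train_s, y_train)

# 5. Evaluation with metrics beyond accuracy.
y_pred = model.predict(X_test_s)
cm = confusion_matrix(y_test, y_pred)
fraud_recall = recall_score(y_test, y_pred)
print(cm)
print(classification_report(y_test, y_pred, digits=3))
```

The sketch fits the scaler on the training split only so that test-set statistics do not leak into training; the steps above do not say which order the project itself used.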

Technologies Used

  • Python
  • Pandas & NumPy
  • Scikit-learn
  • Matplotlib & Seaborn
  • Jupyter Notebook

Dataset

The dataset used in this project is publicly available on Kaggle.

🔗 Dataset link: https://www.kaggle.com/

Due to GitHub file size limitations, the dataset is not included in this repository.


Results

  • Built a Logistic Regression fraud detection model that uses class weighting to handle imbalanced data
  • Improved the ability to identify fraudulent transactions by prioritizing recall over raw accuracy
  • Learned how proper evaluation metrics impact real-world ML systems

What I Learned

This project helped me understand how much real-world machine learning problems differ from theoretical examples. I learned that accuracy alone can be misleading on imbalanced datasets. I also gained hands-on experience with data preprocessing, feature scaling, handling convergence issues, and evaluating models with precision and recall instead of relying only on accuracy.


Future Improvements

  • Experiment with advanced models such as Random Forest and XGBoost
  • Apply oversampling techniques like SMOTE
  • Perform hyperparameter tuning
  • Deploy the model using a simple web application
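
As one possible shape for the hyperparameter tuning item, the sketch below runs a small grid search over the regularization strength `C` of a class-weighted Logistic Regression, scored by recall. It is an illustration only: the data is synthetic, and the grid values, fold count, and scoring choice are assumptions rather than anything the project has done.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic imbalanced data (assumption): roughly 5% positives.
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=0,
)

# Scaling lives inside the pipeline so it is re-fitted on each training fold.
pipeline = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)

# make_pipeline names each step after its lowercased class name.
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}

# Stratified folds keep the fraud ratio similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, scoring="recall", cv=cv)
search.fit(X, y)

print("best params:", search.best_params_)
print("best mean CV recall:", round(search.best_score_, 3))
```

Scoring on recall alone can reward a model that flags too many legitimate transactions, so `scoring="f1"` or `scoring="average_precision"` may be a better fit depending on the cost of false alarms.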

Key Takeaway

Building a machine learning model is not just about choosing an algorithm. Understanding the data, handling imbalance, and selecting the right evaluation metrics are equally important for creating reliable real-world solutions.


Author

Divyansh Rawal
Aspiring Data Scientist
