rutujdv/DiabetesDataAnalysis

Diabetes Data Analysis

Overview

This project analyzes diabetes data through data cleaning, exploratory data analysis, unsupervised learning, and supervised model training. The goal is to identify patterns and build predictive models for diabetes diagnosis.

Features Implemented

- Data Cleaning: Handling missing values by replacing zeros with median values.
- Exploratory Data Analysis (EDA): Correlation heatmap visualization; dimensionality reduction using PCA, UMAP, and t-SNE.
- Unsupervised Learning: K-Means clustering for pattern recognition.
- Supervised Learning Models: Logistic Regression, Random Forest Classifier, XGBoost Classifier.
- Model Evaluation: Accuracy computation; cross-validation for model performance assessment.
- Hyperparameter Tuning: GridSearchCV for optimizing Random Forest parameters.
- Feature Selection: Selecting the most important features using Random Forest.
- Regularization Experiment: L2 regularization applied to Logistic Regression.
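The zero-replacement cleaning step listed above could look roughly like the sketch below. It is not taken from the project script: the helper name `replace_zeros_with_median`, the column names `Glucose` and `BMI`, and the toy values are all assumptions for illustration, as is the choice to compute the median over non-zero entries only.

```python
import numpy as np
import pandas as pd


def replace_zeros_with_median(df, columns):
    """Return a copy of df where zeros in `columns` become that column's median.

    Assumption: the median is taken over the non-zero entries only, so the
    placeholder zeros do not pull it down.
    """
    cleaned = df.copy()
    for col in columns:
        # Treat zeros as missing, then fill them with the column median.
        nonzero = cleaned[col].replace(0, np.nan)
        cleaned[col] = nonzero.fillna(nonzero.median())
    return cleaned


# Toy data with hypothetical column names; the real diabetes.csv may differ.
toy = pd.DataFrame(
    {
        "Glucose": [148, 0, 183, 89],
        "BMI": [33.6, 26.6, 0.0, 28.1],
        "Outcome": [1, 0, 1, 0],
    }
)

cleaned = replace_zeros_with_median(toy, ["Glucose", "BMI"])
print(cleaned)
```

Only the columns passed in are modified, which matters for a binary target column where zero is a legitimate value.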

Dataset

The dataset used is the diabetes.csv file, which contains relevant features for diabetes prediction.

Installation and Requirements

Dependencies: Ensure you have the following Python libraries installed:

    pip install pandas numpy seaborn matplotlib scikit-learn umap-learn xgboost

How to Run

1. Place diabetes.csv in the appropriate directory.
2. Run the Python script to execute the analysis:

    python diabetes_analysis.py

Results

- Logistic Regression Accuracy: 76.62%
- Random Forest Accuracy: 76.62%
- XGBoost Accuracy: 73.38%
- Optimized Random Forest CV Accuracy: 78.02%
- Feature Selection Improved Accuracy: 74.68%
- Regularized Logistic Regression Accuracy: 76.62%
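For readers who want to see how figures of this kind are typically produced, here is a minimal sketch using scikit-learn. It runs on synthetic data from `make_classification` rather than diabetes.csv, and the split ratio, fold counts, and parameter grid are assumptions, not the settings used in the project, so it will not reproduce the numbers above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

# Synthetic stand-in for the real data: 8 features, binary target.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Held-out accuracy for logistic regression. scikit-learn applies an L2
# penalty by default; a smaller C means stronger regularization.
log_reg = LogisticRegression(C=1.0, max_iter=1000)
log_reg.fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, log_reg.predict(X_test))

# 5-fold cross-validated accuracy for a random forest on the training split.
forest = RandomForestClassifier(n_estimators=50, random_state=42)
cv_scores = cross_val_score(forest, X_train, y_train, cv=5, scoring="accuracy")

# Small illustrative grid search over random forest hyperparameters.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=42), param_grid, cv=3, scoring="accuracy"
)
search.fit(X_train, y_train)

print(f"Logistic regression test accuracy: {test_accuracy:.4f}")
print(f"Random forest mean CV accuracy: {cv_scores.mean():.4f}")
print(f"Best grid-search params: {search.best_params_}")
print(f"Best grid-search CV accuracy: {search.best_score_:.4f}")
```

A cross-validated score such as `search.best_score_` is averaged over folds of the training data, so it is not directly comparable to a single held-out test accuracy.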

Future Improvements

- Experiment with additional feature engineering techniques.
- Try deep learning models for better prediction accuracy.
- Expand the dataset with external medical records for improved insights.
