# Heart Failure Classification

Heart failure is a leading cause of mortality worldwide, with an estimated 64.3 million people affected globally. Early prediction of adverse outcomes using routinely collected clinical data can substantially improve patient management and resource allocation. This project applies both supervised and unsupervised machine learning techniques to the Heart Failure Clinical Records dataset (Chicco & Jurman, 2020) to predict patient mortality. We implement and compare two deep learning classifiers — a Convolutional Neural Network (CNN) and a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) — alongside two unsupervised approaches — K-Means clustering and a custom Self-Organizing Map (SOM). A lightweight Random Forest model is also trained and exported for real-time inference through an interactive Streamlit web application.
## Table of Contents

- Background
- Dataset
- Methodology
- Results
- Interactive Demo
- Repository Structure
- Installation
- Usage
- References
## Background

Cardiovascular diseases account for approximately 31% of all global deaths (WHO, 2021). Heart failure specifically occurs when the heart cannot pump sufficient blood to meet the body's needs. Timely prediction of patient outcomes following a heart failure episode is critical for clinical decision-making.
Machine learning has shown significant promise in this domain. Chicco & Jurman (2020) demonstrated that a simple set of 12 clinical features, routinely measured during follow-up visits, could predict patient survival with high accuracy. This project builds upon that foundation by:
- Conducting rigorous exploratory data analysis with statistical feature selection
- Implementing and comparing multiple deep learning architectures
- Applying unsupervised clustering methods to uncover latent patient subgroups
- Deploying the best-performing model in an accessible web interface
## Dataset

**Source:** UCI Machine Learning Repository

**Citation:** Chicco, D., & Jurman, G. (2020). Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone. *BMC Medical Informatics and Decision Making*, 20(1), 1–16. https://doi.org/10.1186/s12911-020-1023-5
| Property | Value |
|---|---|
| Instances | 299 |
| Features | 12 clinical predictors + 1 target |
| Missing Values | None |
| Class Balance | 67.9% survived / 32.1% deceased |
| Follow-up Period | 4–285 days |
| Feature | Type | Unit | Description |
|---|---|---|---|
| `age` | Numeric | years | Patient age |
| `anaemia` | Binary | — | Decrease of red blood cells (haemoglobin) |
| `creatinine_phosphokinase` | Numeric | mcg/L | Level of CPK enzyme in blood |
| `diabetes` | Binary | — | Presence of diabetes |
| `ejection_fraction` | Numeric | % | Percentage of blood leaving the heart per contraction |
| `high_blood_pressure` | Binary | — | Presence of hypertension |
| `platelets` | Numeric | kiloplatelets/mL | Platelet count in blood |
| `serum_creatinine` | Numeric | mg/dL | Creatinine level in blood serum |
| `serum_sodium` | Numeric | mEq/L | Sodium level in blood serum |
| `sex` | Binary | — | Biological sex (0 = female, 1 = male) |
| `smoking` | Binary | — | Smoking status |
| `time` | Numeric | days | Follow-up period duration |
| `DEATH_EVENT` | Binary | — | Target: 1 = deceased, 0 = survived |
## Methodology

### Preprocessing

- Unit Normalization: Platelet counts converted to kiloplatelets/mL for consistency
- Feature Renaming: `creatinine_phosphokinase` → `CPK` for readability
- Scaling: StandardScaler (zero mean, unit variance) for neural network inputs; MinMaxScaler for SOM
- Class Imbalance: SMOTE (Synthetic Minority Over-sampling Technique) applied to the training set only
### Statistical Feature Selection

Chi-square independence tests assessed the statistical association between binary features and the target variable (`DEATH_EVENT`). Features with p > 0.05 (sex, high blood pressure, diabetes) were identified as less discriminative, though all 12 features were retained for the full models to avoid information loss.
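A minimal sketch of one such test, assuming `scipy.stats.chi2_contingency` on a 2×2 contingency table built from a binary feature and the outcome (synthetic data here, not the real dataset):

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(1)
death = rng.integers(0, 2, size=299)    # stand-in for DEATH_EVENT
feature = rng.integers(0, 2, size=299)  # an uninformative binary feature

# 2x2 contingency table: feature value (rows) vs. outcome (columns)
table = np.array([[np.sum((feature == i) & (death == j)) for j in (0, 1)]
                  for i in (0, 1)])
chi2, p, dof, expected = chi2_contingency(table)
discriminative = p <= 0.05  # p > 0.05 => feature flagged as less discriminative
```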
### Convolutional Neural Network (CNN)

The input features are reshaped into a 1D sequence (12 × 1) and processed through three Conv1D layers with increasing filter counts (64, 128, 256), interleaved with MaxPooling1D and Dropout (rate = 0.25) layers. A final dense layer with softmax activation produces class probabilities.
| Hyperparameter | Value |
|---|---|
| Activation (Conv) | Sigmoid |
| Optimizer | Adam |
| Loss | Binary cross-entropy |
| Epochs | 1000 (EarlyStopping, patience=30) |
| Batch Size | 32 |
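A minimal Keras sketch of this architecture. Kernel sizes, padding, and the flatten/dense head are assumptions beyond the hyperparameters stated above:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(12, 1)),  # 12 features reshaped into a 1D sequence
    layers.Conv1D(64, 3, padding="same", activation="sigmoid"),
    layers.MaxPooling1D(2),
    layers.Dropout(0.25),
    layers.Conv1D(128, 3, padding="same", activation="sigmoid"),
    layers.MaxPooling1D(2),
    layers.Dropout(0.25),
    layers.Conv1D(256, 3, padding="same", activation="sigmoid"),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(2, activation="softmax"),  # class probabilities
])
model.compile(optimizer="adam", loss="binary_crossentropy")
# Training would use EarlyStopping(patience=30), up to 1000 epochs, batch_size=32
probs = model.predict(np.random.rand(4, 12, 1), verbose=0)
```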
### LSTM Recurrent Neural Network

The input is reshaped to a 3D tensor for recurrent processing. A single LSTM layer (64 units) captures temporal dependencies across the feature sequence, followed by dropout and dense layers.
| Hyperparameter | Value |
|---|---|
| LSTM Units | 64 |
| Activation (LSTM) | Tanh |
| Optimizer | Adam |
| Validation Split | 20% |
| Epochs | 1000 (EarlyStopping, patience=30) |
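A comparable Keras sketch of the LSTM model; the dropout rate and dense head are assumptions beyond the stated hyperparameters:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(12, 1)),         # features treated as a length-12 sequence
    layers.LSTM(64, activation="tanh"),  # single recurrent layer
    layers.Dropout(0.25),
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
early = keras.callbacks.EarlyStopping(patience=30, restore_best_weights=True)
# model.fit(..., validation_split=0.2, epochs=1000, callbacks=[early])
probs = model.predict(np.random.rand(4, 12, 1), verbose=0)
```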
### K-Means Clustering

The data are partitioned into k = 2 clusters (corresponding to the two survival outcomes) using Euclidean distance. Cluster quality is evaluated via the Silhouette Score and the Adjusted Rand Index.
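The clustering and both evaluation metrics can be sketched with scikit-learn; synthetic two-group data stands in for the real features here:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, adjusted_rand_score

rng = np.random.default_rng(2)
# Two well-separated synthetic "patient subgroups" in 12-dimensional space
X = np.vstack([rng.normal(0, 1, (150, 12)), rng.normal(6, 1, (149, 12))])
y_true = np.array([0] * 150 + [1] * 149)

km = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X)
sil = silhouette_score(X, km.labels_)          # cohesion vs. separation of clusters
ari = adjusted_rand_score(y_true, km.labels_)  # agreement with the true grouping
```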
### Self-Organizing Map (SOM)

A custom implementation of a 25×25 competitive learning network. Weights are updated iteratively using decaying learning rates and neighborhood radii. After training, each neuron is labeled by majority vote from mapped training samples, enabling classification of unseen patients.
| Hyperparameter | Value |
|---|---|
| Grid Size | 25 × 25 |
| Max Learning Rate | 0.4 |
| Max Neighborhood Distance | 4 |
| Training Steps | 150,001 |
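A toy version of this training loop, shrunk to a 5×5 grid and 2,000 steps for brevity (instead of 25×25 and 150,001). The Gaussian neighborhood function is an assumption about the custom implementation, and the majority-vote labeling step is omitted:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.random((100, 12))          # MinMax-scaled features lie in [0, 1]
grid, dim, steps = 5, 12, 2000
W = rng.random((grid, grid, dim))  # one weight vector per neuron
rows, cols = np.indices((grid, grid))

def quantization_error(W):
    # Mean distance from each sample to its best-matching unit (BMU)
    return float(np.mean([np.linalg.norm(W - x, axis=2).min() for x in X]))

qe_before = quantization_error(W)
for t in range(steps):
    x = X[rng.integers(len(X))]
    lr = 0.4 * (1 - t / steps)              # learning rate decays from 0.4
    radius = max(1.0, 4 * (1 - t / steps))  # neighborhood radius decays from 4
    d = np.linalg.norm(W - x, axis=2)       # distance of every neuron to x
    bi, bj = np.unravel_index(d.argmin(), d.shape)  # BMU coordinates
    # Gaussian neighborhood on the grid pulls the BMU and its neighbors toward x
    g = np.exp(-((rows - bi) ** 2 + (cols - bj) ** 2) / (2 * radius ** 2))
    W += lr * g[:, :, None] * (x - W)
qe_after = quantization_error(W)
```

After training, the map's quantization error (mean sample-to-BMU distance) should drop below its random-initialization value.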
### Random Forest (Deployment Model)

A 300-tree Random Forest classifier trained on the full feature set with SMOTE-balanced data. Serialized via joblib for use in the Streamlit application.
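A sketch of the train-and-serialize step, using scikit-learn's `RandomForestClassifier` and `joblib` on synthetic data (the file name mirrors the project's, but is written to a temporary directory here):

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(299, 12))
y = (X[:, 0] + rng.normal(0, 0.5, 299) > 0).astype(int)  # learnable signal

rf = RandomForestClassifier(n_estimators=300, random_state=42).fit(X, y)

# Serialize with joblib, as consumed by the Streamlit app
path = os.path.join(tempfile.gettempdir(), "rf_heart_failure.pkl")
joblib.dump(rf, path)
clone = joblib.load(path)
probs = clone.predict_proba(X[:5])  # class probabilities for 5 patients
```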
## Results

| Model | Train Accuracy | Test Accuracy | Notes |
|---|---|---|---|
| CNN | 88.76% | 73.3% | Overfitting observed |
| LSTM-RNN | 90.00% | 72.2% | Overfitting observed |
| Random Forest | ~98% | ~83% | Best supervised model |
| SOM | — | 96.32% | Unsupervised; competitive labeling |
### K-Means Cluster Quality

| Metric | Value |
|---|---|
| Silhouette Score | 0.801 |
| Adjusted Rand Index | Computed |
| Within-Cluster Sum of Squares | Computed |
### Key Findings

- SOM achieved the highest classification accuracy (96.32%), demonstrating that unsupervised topological learning can outperform supervised deep learning on this dataset.
- Both CNN and RNN overfit significantly — training accuracy exceeded test accuracy by ~15–18 percentage points — suggesting the 299-sample dataset is insufficient for complex deep learning architectures without stronger regularization or augmentation.
- K-Means produced well-separated clusters (Silhouette = 0.801), indicating that the 12 clinical features encode genuinely distinct patient subpopulations.
- Feature selection identified `time` (follow-up duration) and `serum_creatinine` as the most predictive features, consistent with Chicco & Jurman (2020).
## Interactive Demo

A Streamlit web application is included for real-time prediction:
- Input patient clinical values manually or generate a random synthetic patient
- View survival probability with a confidence gauge
- Inspect feature importance rankings
- Compare against population statistics
## Repository Structure

```
Heart_Failure_Classification/
├── Heart_Failure_Classification.ipynb   # Main analysis notebook
├── app.py                               # Streamlit prediction UI
├── train_model.py                       # Script to train & save Random Forest model
├── heart_failure_clinical_records_dataset.csv
├── requirements.txt
├── README.md
└── models/
    └── rf_heart_failure.pkl             # Saved Random Forest model (generated by train_model.py)
```
## Installation

```bash
# Clone the repository
git clone https://github.com/Abhi183/Heart_Failure_Classification.git
cd Heart_Failure_Classification

# Create virtual environment
python -m venv venv
source venv/bin/activate  # macOS/Linux
# venv\Scripts\activate   # Windows

# Install dependencies
pip install -r requirements.txt
```

## Usage

Run the main analysis notebook:

```bash
jupyter notebook Heart_Failure_Classification.ipynb
```

Train and export the Random Forest model (this generates `models/rf_heart_failure.pkl`):

```bash
python train_model.py
```

Launch the prediction app, then navigate to http://localhost:8501 in your browser:

```bash
streamlit run app.py
```
## References

- Chicco, D., & Jurman, G. (2020). Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone. *BMC Medical Informatics and Decision Making*, 20(1), 1–16. https://doi.org/10.1186/s12911-020-1023-5
- Dua, D., & Graff, C. (2019). UCI Machine Learning Repository. http://archive.ics.uci.edu/ml
- Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic Minority Over-sampling Technique. *Journal of Artificial Intelligence Research*, 16, 321–357.
- Kohonen, T. (1990). The Self-Organizing Map. *Proceedings of the IEEE*, 78(9), 1464–1480.
- LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. *Nature*, 521(7553), 436–444.
## License

This project is licensed under the MIT License. The original dataset is licensed under Creative Commons Attribution 4.0 International (CC BY 4.0).
DSDA 385 — Abhishek Shekhar
