SmartCart is a growing e-commerce platform serving customers across multiple countries. The company currently uses generic marketing strategies without understanding distinct customer behavior patterns.
This project builds an Intelligent Customer Segmentation System that uses unsupervised machine learning to discover hidden behavioral patterns and group customers into meaningful clusters.
The objective is to enable:
- Personalized marketing
- High-value customer identification
- Churn-risk detection
- Data-driven business decisions
The dataset contains 2,240 customer records with 22 features, including:
- Year_Birth
- Education
- Marital_Status
- Income
- Kidhome
- Teenhome
- Dt_Customer
- MntWines
- MntFruits
- MntMeatProducts
- MntFishProducts
- MntSweetProducts
- MntGoldProds
- NumDealsPurchases
- NumWebPurchases
- NumCatalogPurchases
- NumStorePurchases
- NumWebVisitsMonth
- Recency
- Complain
Tools and libraries used:
- Python
- Pandas
- NumPy
- Matplotlib / Seaborn
- Scikit-learn
Clustering algorithms used:
- K-Means
- Hierarchical Clustering
- DBSCAN
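As a rough illustration of how the three algorithms listed above can be run side by side with scikit-learn, here is a minimal sketch. It uses synthetic data from `make_blobs` as a stand-in for the scaled customer features, and the values `n_clusters=4`, `eps=0.3`, and `min_samples=5` are assumptions chosen for the toy data, not settings taken from this project.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled customer feature matrix (assumption).
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=42)
X_scaled = StandardScaler().fit_transform(X)

# Hyperparameters below are illustrative guesses for this toy data.
models = {
    "kmeans": KMeans(n_clusters=4, n_init=10, random_state=42),
    "hierarchical": AgglomerativeClustering(n_clusters=4, linkage="ward"),
    "dbscan": DBSCAN(eps=0.3, min_samples=5),
}

# fit_predict returns one cluster label per row for all three estimators.
labels_by_model = {
    name: model.fit_predict(X_scaled) for name, model in models.items()
}

for name, labels in labels_by_model.items():
    n_clusters = len(set(labels) - {-1})   # DBSCAN marks noise as -1
    n_noise = int(np.sum(labels == -1))
    print(f"{name}: {n_clusters} clusters, {n_noise} noise points")
```

K-Means and hierarchical clustering need the number of clusters up front, while DBSCAN infers it from density and may label some customers as noise, so its output is not directly comparable without tuning `eps`.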
Data preprocessing steps:
- Handled missing values (Income)
- Converted enrollment date into customer tenure
- Feature scaling using StandardScaler
- Removed irrelevant identifiers
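The preprocessing steps above could look roughly like the sketch below. The four sample rows are made up, and several choices are assumptions rather than the project's confirmed approach: median imputation for `Income`, a `dd-mm-YYYY` date format, the latest enrollment date as the reference point for tenure, and `Tenure_Days` as the name of the derived column.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Tiny made-up sample using column names from the dataset description.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Income": [58000.0, np.nan, 71000.0, 26000.0],
    "Dt_Customer": ["04-09-2012", "08-03-2014", "21-08-2013", "10-02-2014"],
    "Recency": [58, 38, 26, 26],
})

# 1. Missing Income: median imputation (assumption; other strategies work too).
df["Income"] = df["Income"].fillna(df["Income"].median())

# 2. Enrollment date -> tenure in days, measured from the most recent
#    enrollment in the data (assumed reference date).
df["Dt_Customer"] = pd.to_datetime(df["Dt_Customer"], format="%d-%m-%Y")
reference_date = df["Dt_Customer"].max()
df["Tenure_Days"] = (reference_date - df["Dt_Customer"]).dt.days

# 3. Drop the identifier and the raw date, which carry no behavioral signal.
features = df.drop(columns=["ID", "Dt_Customer"])

# 4. Standardize so every feature has zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(features)
```

Scaling matters here because distance-based algorithms such as K-Means would otherwise be dominated by large-magnitude columns like `Income`.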
Engineered features:
- Total spending calculation
- Total purchase frequency
- Customer tenure extraction
- Loyalty indicator features
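A minimal sketch of these engineered features follows, using three made-up rows. The column names `Total_Spending`, `Total_Purchases`, and `Is_Loyal` are hypothetical, as is the loyalty rule itself (tenure and purchase count both at or above the median); the source does not define its loyalty indicator. Summing only the web, catalog, and store channels for purchase frequency is likewise an assumption.

```python
import pandas as pd

# Made-up rows; Tenure_Days is assumed to be precomputed from Dt_Customer.
df = pd.DataFrame({
    "MntWines": [635, 11, 426],
    "MntFruits": [88, 1, 49],
    "MntMeatProducts": [546, 6, 127],
    "MntFishProducts": [172, 2, 111],
    "MntSweetProducts": [88, 1, 21],
    "MntGoldProds": [88, 6, 42],
    "NumWebPurchases": [8, 1, 8],
    "NumCatalogPurchases": [10, 1, 2],
    "NumStorePurchases": [4, 2, 10],
    "Tenure_Days": [663, 113, 312],
})

# Total spending: sum over every product-category amount column.
spend_cols = [c for c in df.columns if c.startswith("Mnt")]
df["Total_Spending"] = df[spend_cols].sum(axis=1)

# Total purchase frequency: sum over the three purchase channels.
purchase_cols = ["NumWebPurchases", "NumCatalogPurchases", "NumStorePurchases"]
df["Total_Purchases"] = df[purchase_cols].sum(axis=1)

# Hypothetical loyalty flag: long tenure AND frequent purchasing.
long_tenure = df["Tenure_Days"] >= df["Tenure_Days"].median()
frequent = df["Total_Purchases"] >= df["Total_Purchases"].median()
df["Is_Loyal"] = (long_tenure & frequent).astype(int)

print(df[["Total_Spending", "Total_Purchases", "Is_Loyal"]])
```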
Modeling workflow:
- Applied K-Means clustering
- Used Elbow Method to determine optimal clusters
- Evaluated clusters using Silhouette Score
- Visualized clusters using PCA
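The workflow above can be sketched as follows. This is an illustration on synthetic data, not the project's actual run: the candidate range `k = 2..8`, picking the final `k` by highest silhouette score, and the `make_blobs` parameters are all assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled customer features (assumption).
X, _ = make_blobs(n_samples=400, centers=4, n_features=6, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Elbow method: record inertia (within-cluster sum of squares) for each k,
# and the silhouette score as a second opinion on cluster quality.
inertias, silhouettes = {}, {}
for k in range(2, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias[k] = km.inertia_
    silhouettes[k] = silhouette_score(X_scaled, km.labels_)

# Assumed selection rule: take the k with the highest silhouette score.
best_k = max(silhouettes, key=silhouettes.get)
final_model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X_scaled)

# Project to two principal components for plotting; coords[:, 0] and
# coords[:, 1] can be passed to a scatter plot colored by final_model.labels_.
coords = PCA(n_components=2).fit_transform(X_scaled)

for k in inertias:
    print(f"k={k}: inertia={inertias[k]:.1f}, silhouette={silhouettes[k]:.3f}")
print("selected k:", best_k)
```

The elbow is usually read off a plot of inertia against `k` by eye, so pairing it with the silhouette score gives a less subjective check.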
The model identified distinct customer segments such as:
- High-Value Premium Customers
- Frequent Discount Shoppers
- High Engagement but Low Purchase Users
- Churn-Prone Customers
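Segment names like these are typically assigned by inspecting per-cluster averages. The sketch below shows one way to build such a profile table; the rows, the `Cluster` labels, and the `Total_Spending` and `Share` columns are made up for illustration and do not reflect the project's actual results.

```python
import pandas as pd

# Made-up customers with already-assigned cluster labels (assumption).
df = pd.DataFrame({
    "Cluster": [0, 0, 1, 1, 2, 2],
    "Total_Spending": [1600, 1400, 300, 260, 40, 60],
    "NumDealsPurchases": [1, 1, 6, 8, 1, 2],
    "NumWebVisitsMonth": [2, 3, 6, 7, 8, 9],
    "Recency": [10, 20, 30, 25, 80, 90],
})

# Average behavior per cluster: the basis for naming each segment.
profile = df.groupby("Cluster").mean().round(1)

# Fraction of all customers that falls in each cluster.
profile["Share"] = df["Cluster"].value_counts(normalize=True).sort_index()

print(profile)
```

In this toy table, a cluster with high spending and low recency would read as premium, one with many deal purchases as discount-driven, and one with many web visits but little spending and high recency as a churn risk.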
These segments can help SmartCart:
- Design targeted campaigns
- Improve retention strategies
- Optimize marketing budget allocation
By implementing this segmentation system, SmartCart can:
- Increase marketing ROI
- Reduce churn rate
- Improve customer lifetime value (CLV)
- Enable data-driven personalization
Possible future enhancements:
- Deploy as a web-based dashboard
- Integrate with real-time transaction data
- Apply RFM-based segmentation
- Combine with churn prediction model