An end-to-end economic intelligence project that integrates 7 World Bank macroeconomic datasets, trains ML regression models to predict GDP, and delivers a 4-page interactive Power BI dashboard for global and country-level economic analysis.
- 🗺️ World map — GDP by country (filled choropleth)
- 📈 GDP trend lines — Top economies 2000–2024
- 🔢 KPI cards — Total global GDP, world population, country count
- 🎛️ Year range slicer — filter all visuals dynamically
- 📉 Exports vs Imports scatter chart
- 🏦 Top countries by FDI Inflows (bar chart)
- 📊 Inflation trends over time (line chart)
- 🌐 Country slicer for filtered analysis
- 🎯 Feature importance ranking (Random Forest)
- 🔵 Actual vs Predicted GDP — Random Forest
- 🔴 Actual vs Predicted GDP — Decision Tree
- 📊 R² score comparison chart
- 🔍 Country selector with full KPI panel
- 📈 GDP growth trajectory over time
- 📋 Detailed data table with all indicators
7 World Bank Excel Files (GDP, Exports, Imports, FDI, Inflation, Population, Govt Expenditure)
↓
Python Pipeline (pandas) — Melt wide→long, merge on Country+Year, clean missing values
↓
EDA — Correlation heatmap, scatter plots, distribution analysis, top economy trends
↓
Feature Selection — Random Forest importance ranking
↓
ML Models — Decision Tree & Random Forest Regressor (80/20 split)
↓
Evaluation — MAE, RMSE, R² | Residual analysis
↓
Export — PowerBI_GDP_Main.xlsx + Predictions + Feature Importance
↓
Power BI Dashboard — 4 interactive pages
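
The reshape-and-merge step in the pipeline above could look roughly like the sketch below. It is an illustration under stated assumptions, not the notebook's actual code: the `Country` column name, the `melt_indicator` helper, the tiny synthetic tables, and the choice to drop rows with a missing target are all hypothetical stand-ins.

```python
import pandas as pd


def melt_indicator(wide: pd.DataFrame, value_name: str) -> pd.DataFrame:
    """Reshape one wide indicator table (one column per year) to long format.

    Hypothetical helper: assumes an identifier column named "Country"
    and year columns whose labels can be parsed as integers.
    """
    long = wide.melt(id_vars="Country", var_name="Year", value_name=value_name)
    long["Year"] = long["Year"].astype(int)
    return long


# Synthetic stand-ins for two of the seven indicator files (made-up values).
gdp_wide = pd.DataFrame({
    "Country": ["A", "B"],
    "2000": [100.0, 200.0],
    "2001": [110.0, None],  # a missing GDP observation
})
pop_wide = pd.DataFrame({
    "Country": ["A", "B"],
    "2000": [10.0, 20.0],
    "2001": [11.0, 21.0],
})

gdp_long = melt_indicator(gdp_wide, "GDP")
pop_long = melt_indicator(pop_wide, "Population")

# Merge on the shared Country + Year key.
merged = gdp_long.merge(pop_long, on=["Country", "Year"], how="inner")

# Assumed cleaning rule: drop rows where the target (GDP) is missing.
clean = merged.dropna(subset=["GDP"]).reset_index(drop=True)
print(clean)
```

With all seven indicators, the same helper would be applied to each table and the results merged one after another on `["Country", "Year"]`.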
- Source: World Bank Open Data
- Coverage: 180+ countries | 2000–2024
- Indicators integrated:
| File | Indicator | Role |
|---|---|---|
| GDP.xls | GDP (current US$) | Target variable |
| Export.xls | Exports of goods & services | Feature |
| Imports.xls | Imports of goods & services | Feature |
| FDI inflows.xls | Foreign Direct Investment net inflows | Feature |
| Inflation.xls | Consumer Price Index | Feature |
| Population.xls | Total population | Feature |
| Goverment_Expenditure.xls | General government expenditure | Feature |
| Metric | Decision Tree | Random Forest |
|---|---|---|
| MAE | Higher | ✅ Lower |
| RMSE | Higher | ✅ Lower |
| R² Score | Good | ✅ Better |
| Winner | | ✅ Random Forest |
Key finding: Population and Government Expenditure rank as the top GDP predictors globally. The Random Forest ensemble outperforms the single Decision Tree on all three metrics (lower MAE, lower RMSE, higher R²), which is consistent with averaging over many trees reducing variance on the cross-country dataset.
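
As a rough illustration of how such a comparison can be set up with scikit-learn, the sketch below trains both regressors on an 80/20 split and computes MAE, RMSE, and R², then ranks features by Random Forest importance. The data is synthetic and randomly generated, the feature names only mirror the indicator list, and the hyperparameters are assumptions, so the numbers it prints say nothing about the project's real results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data only: names mirror the indicators, values are random.
rng = np.random.default_rng(0)
feature_names = ["Exports", "Imports", "FDI", "Inflation",
                 "Population", "Govt_Expenditure"]
n_rows = 400
X = rng.normal(size=(n_rows, len(feature_names)))
# Made-up target that depends mostly on two of the columns, plus noise.
y = 3.0 * X[:, 4] + 2.0 * X[:, 5] + 0.5 * X[:, 0] \
    + rng.normal(scale=0.5, size=n_rows)

# 80/20 train/test split, as described in the pipeline.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hyperparameters here are illustrative defaults, not the notebook's.
models = {
    "Decision Tree": DecisionTreeRegressor(random_state=42),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "MAE": mean_absolute_error(y_test, pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
        "R2": r2_score(y_test, pred),
    }

# Rank features by impurity-based importance from the fitted forest.
importances = sorted(
    zip(feature_names, models["Random Forest"].feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
for feature, score in importances:
    print(f"{feature}: {score:.3f}")
```

Impurity-based importances can favor features with many distinct values, so permutation importance is a common cross-check when the ranking matters.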
# Clone
git clone https://github.com/Derio001/exploratory-predictive-gdp-analysis.git
cd exploratory-predictive-gdp-analysis
# Install dependencies
pip install pandas numpy matplotlib seaborn scikit-learn openpyxl xlrd
# Run the notebook
jupyter notebook Project_Implementation.ipynb

- Download Power BI Desktop (free)
- Open `powerbi_implementation.pbix`
- If prompted, re-link the data source to your local `PowerBI_GDP_Main.xlsx`
exploratory-predictive-gdp-analysis/
│
├── Project_Implementation.ipynb # Full Python pipeline (78 cells)
├── powerbi_implementation.pbix # 4-page Power BI dashboard
│
├── data/
│ ├── Global GDP and Macroeconomic Indicators Dataset/
│ │ ├── GDP.xls
│ │ ├── Export.xls
│ │ ├── Imports.xls
│ │ ├── FDI inflows.xls
│ │ ├── Inflation.xls
│ │ ├── Population.xls
│ │ └── Goverment_Expenditure.xls
│ │
│ ├── Integrated_GDP_Dataset.csv # Cleaned unified dataset
│ ├── PowerBI_GDP_Main.xlsx # Main Power BI data source
│ ├── PowerBI_Predictions.csv # Actual vs predicted GDP
│ └── PowerBI_Feature_Importance.csv # Feature rankings
│
└── README.md
- Add a Sub-Saharan Africa-focused lens (deep dive on Chad, Niger, Mali, Sudan)
- Incorporate a conflict/fragility index as a feature for Sahel economies
- Time-series forecasting (ARIMA / LSTM) for multi-year GDP projection
- Live World Bank API integration for automatic data refresh
- Streamlit web app version for browser-based access
Mahamat Hanga Derio
M.Tech Data Science, Christ University, Bangalore
Chadian national | Building economic intelligence tools for Sub-Saharan Africa

📬 Open to collaboration with development banks, economic research institutions & policy organizations
🔗 GitHub | LRI Child Health Project →
Part of a portfolio focused on data-driven economic and public health analysis for the Lake Chad Basin and Sub-Saharan Africa.