# Data Analysis Project Template

**A production-ready data analysis environment with Python + R integration**

This template provides a complete multi-language data analysis environment with Python and R integration, designed for data scientists, analysts, and researchers who want to leverage the best of both statistical ecosystems.

- 🐍 Python + 📊 R Integration: Seamless data exchange between Python and R
- 🐳 Docker Environment: Python 3.12 + R 4.3 + RStudio Server + Jupyter Lab
- 📈 Domain Examples: Finance and marketing analytics with real-world applications
- Statistical Power: Advanced statistical modeling and hypothesis testing
- 🎨 Rich Visualizations: Interactive plots with plotly, ggplot2, and matplotlib
- 🗄️ Database Ready: PostgreSQL and Redis integration
- 🧪 Testing Framework: Pytest with coverage reporting
- 📚 Documentation: Comprehensive guides and examples

## 💼 Domain-Specific Examples

### 🏦 Finance Analytics (`examples/finance/` + `examples/r_analytics/`)

Comprehensive financial data analysis in **both Python and R**:
- **Portfolio Analysis**: Risk assessment, performance metrics, Sharpe/Sortino ratios
- **Trading Analytics**: Technical indicators, backtesting, market analysis
- **Risk Management**: VaR, stress testing, Monte Carlo simulations
- **R Finance**: Advanced statistical modeling with quantmod and PerformanceAnalytics

## 🐳 Docker Quick Start

**Recommended approach for a consistent, multi-language development environment.**

Choose your preferred R integration level:

Full Python + R environment with RStudio Server:

    # 1. Start all services (Jupyter Lab + RStudio + PostgreSQL + Redis)
    docker-compose up -d

    # 2. Access your preferred interface
    open http://localhost:8888  # Jupyter Lab (Python + R kernels)
    open http://localhost:8787  # RStudio Server (Pure R)

    # 3. View service status
    ./docker/docker-utils.sh status

Python + R in Jupyter Lab only (faster build):

    docker-compose -f docker-compose.simple.yml up -d

    # Access Jupyter with both kernels:
    open http://localhost:8888  # Jupyter Lab (Python + R kernels)

🔧 Complete Docker Guide: See DOCKER.md for detailed setup, troubleshooting, and deployment.

| Service | URL | Purpose |
|---|---|---|
| Jupyter Lab | http://localhost:8888 | Python + R notebooks |
| RStudio Server | http://localhost:8787 | Pure R development |
| PostgreSQL | localhost:5432 | Data storage |
| Redis | localhost:6379 | Caching |

Login for RStudio: Username: `analyst`, Password: `analyst`
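Once the containers are running, one quick way to confirm that each service is listening is to probe its port. The sketch below assumes the default host ports from the services table (8888, 8787, 5432, 6379) and uses only the standard library. `SERVICES` and `is_port_open` are illustrative names and are not part of the template.

```python
import socket

# Host ports taken from the services table; assumed to match docker-compose.yml
SERVICES = {
    "Jupyter Lab": 8888,
    "RStudio Server": 8787,
    "PostgreSQL": 5432,
    "Redis": 6379,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


def check_services(host: str = "localhost") -> dict:
    """Map each service name to True (reachable) or False (not reachable)."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}
```

Calling `check_services()` returns a dictionary such as `{"Jupyter Lab": True, ...}`. An open port only shows that something is listening there, so `./docker/docker-utils.sh status` remains the better source for container health.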
Project structure:

    data_analysis/
    ├── .github/  # GitHub configurations
    │   └── copilot-instructions.md
    ├── src/  # Source code modules
    │   ├── __init__.py
    │   ├── data_processing.py  # Data cleaning and preprocessing
    │   ├── visualization.py  # Plotting and visualization utilities
    │   ├── analysis.py  # Analysis functions
    │   └── utils.py  # Utility functions
    ├── notebooks/  # Jupyter notebooks
    │   ├── 01_data_exploration.ipynb
    │   ├── 02_data_cleaning.ipynb
    │   ├── 03_analysis.ipynb
    │   └── 04_modeling.ipynb
    ├── data/  # Data storage
    │   ├── raw/  # Original, immutable data
    │   ├── processed/  # Cleaned and processed data
    │   └── external/  # External datasets
    ├── tests/  # Test files
    │   ├── __init__.py
    │   ├── test_data_processing.py
    │   ├── test_visualization.py
    │   └── test_analysis.py
    ├── docs/  # Documentation
    │   ├── data_dictionary.md  # Data field descriptions
    │   ├── methodology.md  # Analysis methodology
    │   └── results.md  # Results and findings
    ├── configs/  # Configuration files
    │   ├── config.yaml  # Main configuration
    │   └── logging.yaml  # Logging configuration
    ├── outputs/  # Generated outputs
    │   ├── figures/  # Plots and visualizations
    │   └── models/  # Trained models
    ├── requirements.txt  # Python dependencies
    ├── .env.example  # Environment variables template
    ├── .gitignore  # Git ignore rules
    └── README.md  # This file
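If some folders are missing after cloning (for example, empty data directories that Git does not track), a small helper can recreate them. This is a minimal sketch that assumes the directory names shown in the tree above; `PROJECT_DIRS`, `missing_dirs`, and `create_layout` are hypothetical names and do not exist in the template.

```python
from pathlib import Path

# Directories copied from the project tree; files such as README.md are not created
PROJECT_DIRS = [
    ".github",
    "src",
    "notebooks",
    "data/raw",
    "data/processed",
    "data/external",
    "tests",
    "docs",
    "configs",
    "outputs/figures",
    "outputs/models",
]


def missing_dirs(root: Path) -> list:
    """Return the expected directories that do not exist under root."""
    return [d for d in PROJECT_DIRS if not (root / d).is_dir()]


def create_layout(root: Path) -> None:
    """Create every expected directory under root, leaving existing ones untouched."""
    for d in PROJECT_DIRS:
        (root / d).mkdir(parents=True, exist_ok=True)
```

Running `create_layout(Path("."))` from the project root is idempotent, so it is safe to call repeatedly.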
This template provides seamless integration between Python and R, allowing you to leverage the best of both languages:
Calling R from Python with rpy2:

    import rpy2.robjects as ro
    from rpy2.robjects import pandas2ri

    # Enable automatic pandas-R dataframe conversion
    pandas2ri.activate()

    # Execute R code from Python
    ro.r('''
    library(ggplot2)
    library(dplyr)
    # Perform statistical analysis in R
    model <- lm(mpg ~ wt + hp, data = mtcars)
    summary(model)
    ''')

Calling Python from R with reticulate:

    library(reticulate)

    # Use Python libraries in R
    py_run_string("
    import pandas as pd
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    # Machine learning in Python
    model = RandomForestClassifier()
    ")

    # Access Python objects in R
    py$model

- Python: Machine learning (scikit-learn), deep learning, web scraping
- R: Advanced statistics, specialized packages (ggplot2, dplyr, tidyverse)
- Shared: Data frames, visualizations, model results
📚 Examples: See examples/r_analytics/python_r_integration.R for comprehensive examples.
- Start with `notebooks/01_data_exploration.ipynb`
- Understand data structure, quality, and patterns
- Document findings in `docs/data_dictionary.md`
- Use `notebooks/02_data_cleaning.ipynb`
- Implement reusable functions in `src/data_processing.py`
- Save processed data to `data/processed/`
- Conduct analysis in `notebooks/03_analysis.ipynb`
- Build models in `notebooks/04_modeling.ipynb`
- Create visualizations using `src/visualization.py`
- Update methodology in `docs/methodology.md`
- Document results in `docs/results.md`
- Keep notebooks clean and well-commented
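To illustrate the cleaning step in the workflow above, here is a rough sketch of the kind of reusable function that could live in `src/data_processing.py`. It uses the standard `csv` module so that it runs without dependencies, whereas the template itself lists pandas. The function names and the cleaning rules (trim whitespace, drop incomplete rows, drop exact duplicates) are assumptions chosen for illustration, not the template's actual implementation.

```python
import csv
from pathlib import Path


def clean_rows(rows: list) -> list:
    """Strip whitespace, drop rows with empty fields, and drop exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        stripped = {key: (value or "").strip() for key, value in row.items()}
        if any(value == "" for value in stripped.values()):
            continue  # incomplete row
        fingerprint = tuple(sorted(stripped.items()))
        if fingerprint in seen:
            continue  # exact duplicate of an earlier row
        seen.add(fingerprint)
        cleaned.append(stripped)
    return cleaned


def process_file(raw_path: Path, processed_dir: Path) -> Path:
    """Read a raw CSV, clean it, and write the result under processed_dir."""
    with raw_path.open(newline="") as handle:
        reader = csv.DictReader(handle)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)
    processed_dir.mkdir(parents=True, exist_ok=True)
    out_path = processed_dir / raw_path.name
    with out_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(clean_rows(rows))
    return out_path
```

The raw file is only read, never modified, which matches the "original, immutable data" note on `data/raw/` in the project tree.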
- Modular Architecture: Reusable code in `src/` modules
- Jupyter Integration: Ready-to-use notebooks for analysis
- Data Management: Organized data storage structure
- Testing Framework: Unit tests for data processing functions
- Configuration Management: YAML-based configuration
- Documentation: Structured documentation templates
- Version Control: Git-friendly with proper .gitignore
Core libraries included:
- pandas - Data manipulation and analysis
- numpy - Numerical computing
- matplotlib & seaborn - Data visualization
- scikit-learn - Machine learning
- jupyter - Interactive notebooks
- pytest - Testing framework
- pyyaml - Configuration management
- Environment variables for sensitive data
- Data privacy considerations
- Reproducible analysis pipelines
- Code quality standards
- Documentation requirements
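The first practice in this list, keeping sensitive values in environment variables, can be sketched as follows. The variable names (`POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_HOST`, `POSTGRES_PORT`, `POSTGRES_DB`) are assumptions for illustration; the names actually used by the template are defined in `.env.example`, which is not shown here.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def postgres_url_from_env(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a PostgreSQL connection URL from environment variables.

    The user and password are required and raise KeyError if missing, so a
    misconfigured environment fails loudly instead of using a default secret.
    """
    env = os.environ if env is None else env
    user = env["POSTGRES_USER"]            # assumed variable name
    password = env["POSTGRES_PASSWORD"]    # assumed variable name
    host = env.get("POSTGRES_HOST", "localhost")
    port = env.get("POSTGRES_PORT", "5432")
    database = env.get("POSTGRES_DB", user)
    # Percent-encode credentials so characters like '@' or ' ' do not break the URL
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )
```

Passing a plain dictionary as `env` makes the function easy to unit test without touching the real environment.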
Example usage:

    # Import project modules
    from src.data_processing import load_data, clean_data
    from src.visualization import create_scatter_plot
    from src.analysis import calculate_correlation

    # Load and process data
    raw_data = load_data('data/raw/dataset.csv')
    # Use a separate name so the imported clean_data function is not overwritten
    cleaned_data = clean_data(raw_data)

    # Create visualizations
    create_scatter_plot(cleaned_data, 'x_column', 'y_column')

    # Perform analysis
    correlation = calculate_correlation(cleaned_data)

- Follow PEP 8 coding standards
- Add tests for new functions
- Update documentation
- Use meaningful commit messages
This template is open source and available under the MIT License.
Comprehensive financial data analysis examples including:
- Portfolio Analysis: Risk assessment, performance metrics, Sharpe/Sortino ratios
- Trading Analytics: Technical indicators, backtesting, market analysis
- Risk Management: VaR, stress testing, Monte Carlo simulations
- Financial Utilities: Complete finance calculation library
Key Notebooks:
- `01_portfolio_analysis.ipynb` - Complete portfolio performance evaluation
- Interactive risk-return visualizations and drawdown analysis
- Technical indicators: MACD, RSI, Bollinger Bands
- Monte Carlo simulations for scenario analysis
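To make the risk metrics named in this section concrete, the sketch below computes a Sharpe ratio, a Sortino ratio, and a historical VaR from a list of periodic returns using only the standard library. It is not the code from the finance notebooks or utilities. The function names, the 252-period annualization default, and the simple empirical-quantile VaR convention are assumptions made for illustration.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


def sortino_ratio(returns, target=0.0, periods_per_year=252):
    """Like the Sharpe ratio, but penalizes only returns below the target."""
    excess = [r - target for r in returns]
    downside = math.sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    if downside == 0:
        return math.inf  # no returns fell below the target
    return statistics.mean(excess) / downside * math.sqrt(periods_per_year)


def historical_var(returns, confidence=0.95):
    """Historical VaR as a positive loss, read from the sorted empirical returns."""
    ordered = sorted(returns)
    # Round before truncating so float noise such as 0.9999999 does not shift the index
    index = int(round((1 - confidence) * len(ordered), 10))
    index = min(index, len(ordered) - 1)
    return -ordered[index]
```

Other VaR conventions (interpolated quantiles, parametric, or Monte Carlo) give somewhat different numbers on small samples, so results should be compared only within one convention.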
Advanced marketing and customer analytics in both Python and R:
- Customer Segmentation: RFM analysis, behavioral clustering (Python + R)
- Campaign Analysis: A/B testing, attribution modeling, ROI analysis
- Customer Lifetime Value: Predictive CLV modeling and optimization
- R Marketing: Advanced statistical testing and customer analytics with tidyverse
- Digital Analytics: Conversion funnel, engagement metrics
Key Notebooks:
- `01_customer_segmentation.ipynb` - Automated RFM customer segmentation
- Interactive 3D visualization of customer segments
- Statistical A/B testing with significance analysis
- Marketing attribution across multiple touchpoints
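As an illustration of the A/B significance testing mentioned above, the following sketch runs a pooled two-proportion z-test on conversion counts with the standard library. The marketing notebooks may use a different test or library; `two_proportion_z_test` and its parameters are hypothetical names chosen for this example.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates B - A.

    conv_a and conv_b are conversion counts; n_a and n_b are visitor counts.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

For example, `two_proportion_z_test(100, 1000, 150, 1000)` compares a 10% and a 15% conversion rate. The normal approximation behind this test is only reasonable when each group has a fair number of conversions and non-conversions.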
To run the examples:

    # Python examples (run from the project root; the subshells leave the working directory unchanged)
    (cd examples/finance/ && jupyter lab 01_portfolio_analysis.ipynb)
    (cd examples/marketing/ && jupyter lab 01_customer_segmentation.ipynb)

    # R examples (choose your interface)
    cd examples/r_analytics/

    # Option 1: RStudio Server (Pure R)
    open http://localhost:8787

    # Option 2: Jupyter with R kernel
    jupyter lab 01_statistical_analysis.ipynb

    # Option 3: Run R scripts directly
    Rscript financial_analysis.R

Pure R statistical analysis and advanced modeling:
- Statistical Analysis: Comprehensive hypothesis testing and regression modeling
- R-Python Integration: Seamless data exchange between languages
- Interactive Notebooks: R kernels in Jupyter Lab with rich visualizations
- RStudio Integration: Pure R development environment
Key Files:
- `financial_analysis.R` - Portfolio optimization and risk analytics
- `marketing_analysis.R` - Customer segmentation and A/B testing
- `python_r_integration.R` - Cross-language data pipelines
- `01_statistical_analysis.ipynb` - R statistical modeling in Jupyter
Each example includes sample data, specialized utility functions, and production-ready analysis workflows.
Happy analyzing! 📈