# Configuration Management Strategy - SPIKE-802

## Current State Analysis

### Identified Hardcoded Values
- **Pagination limits**: Default 20, max 100 in workspace service
- **Batch sizes**: Default 5 in LLM orchestrator
- **Timeout values**: 5.0 seconds for HTTP clients
- **Rate limits**: max_workspaces_per_user: 50, max_members_per_workspace: 100
- **Session TTL**: 3600 seconds (1 hour)
- **Token expiration**: 7 days for invitations
- **Field limits**: min_description_length: 20

### Current Configuration Patterns
- **Python Services**: Using Pydantic BaseSettings with .env support
- **Environment Variables**: Mixed approach with some hardcoded defaults
- **Service Discovery**: Hardcoded service URLs in some places

## Proposed Configuration Architecture

### 1. Three-Tier Configuration System

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Environment   │    │     Service      │    │     Runtime     │
│    Variables    │────│   Config Files   │────│    Overrides    │
│  (.env files)   │    │   (YAML/JSON)    │    │    (API/DB)     │
└─────────────────┘    └──────────────────┘    └─────────────────┘
```
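
The diagram names the three tiers but not which one wins when the same key appears in more than one. The sketch below shows one possible lookup order, assuming runtime overrides beat environment variables, which in turn beat file values. That ordering is an assumption, not a decision of this spike; `build_lookup` and the sample keys are hypothetical.

```python
from collections import ChainMap
from typing import Any, Mapping


def build_lookup(
    runtime_overrides: Mapping[str, Any],
    env_vars: Mapping[str, Any],
    file_config: Mapping[str, Any],
) -> ChainMap:
    """Combine the three tiers into one read-only style lookup."""
    # ChainMap searches its maps left to right, so the first map that
    # contains a key supplies the value.
    return ChainMap(dict(runtime_overrides), dict(env_vars), dict(file_config))


file_config = {"pagination_max": 100, "batch_size": 5, "session_ttl": 3600}
env_vars = {"batch_size": 10}
runtime_overrides = {"pagination_max": 50}

settings = build_lookup(runtime_overrides, env_vars, file_config)
print(settings["pagination_max"])  # 50, from the runtime tier
print(settings["batch_size"])      # 10, from the environment tier
print(settings["session_ttl"])     # 3600, only defined in the file tier
```

If the team settles on a different precedence, only the argument order inside `ChainMap(...)` needs to change.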

### 2. Configuration Schema Design

#### Base Configuration Schema
```yaml
# config/base.yaml
application:
  name: "${SERVICE_NAME}"
  version: "1.0.0"
  environment: "${ENVIRONMENT:development}"

server:
  host: "${HOST:0.0.0.0}"
  port: "${PORT:8000}"
  timeout: "${SERVER_TIMEOUT:30}"

database:
  url: "${DATABASE_URL}"
  pool_size: "${DB_POOL_SIZE:10}"
  timeout: "${DB_TIMEOUT:30}"

redis:
  url: "${REDIS_URL}"
  timeout: "${REDIS_TIMEOUT:5}"

pagination:
  default_limit: "${PAGINATION_DEFAULT:20}"
  max_limit: "${PAGINATION_MAX:100}"

batch_processing:
  default_size: "${BATCH_SIZE:5}"
  max_size: "${BATCH_MAX_SIZE:50}"

security:
  api_key_required: "${API_KEY_REQUIRED:true}"
  jwt_expiry: "${JWT_EXPIRY:24h}"
  session_ttl: "${SESSION_TTL:3600}"

features:
  workspace_limit_per_user: "${MAX_WORKSPACES:50}"
  members_limit_per_workspace: "${MAX_MEMBERS:100}"
```
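
The schema mixes two duration styles: `jwt_expiry` defaults to `24h` while `session_ttl` defaults to a bare `3600`. One way to let both coexist is to normalise them at load time. The helper below is a sketch only; `parse_duration` is a hypothetical name, and treating a bare number as seconds is an assumption based on the Session TTL entry above.

```python
import re
from datetime import timedelta

# Supported unit suffixes mapped to timedelta keyword arguments.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
_DURATION = re.compile(r"^(\d+)([smhd]?)$")


def parse_duration(raw: str) -> timedelta:
    """Parse values such as "3600", "90s", "24h", or "7d" into a timedelta."""
    match = _DURATION.match(raw.strip().lower())
    if match is None:
        raise ValueError(f"Unsupported duration: {raw!r}")
    amount, unit = match.groups()
    # Assumption: a bare number means seconds, matching SESSION_TTL's default.
    return timedelta(**{_UNITS[unit or "s"]: int(amount)})


print(parse_duration("24h"))   # 1 day, 0:00:00
print(parse_duration("3600"))  # 1:00:00
```

Picking a single style for every duration in `base.yaml` would remove the need for this helper.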

## File Structure Recommendations

```
project-root/
├── config/
│   ├── base.yaml                    # Base configuration
│   ├── environments/
│   │   ├── development.yaml         # Dev overrides
│   │   ├── staging.yaml             # Staging overrides
│   │   └── production.yaml          # Production overrides
│   └── services/
│       ├── workspace-service.yaml   # Service-specific config
│       ├── llm-orchestrator.yaml
│       └── airtable-gateway.yaml
├── shared/
│   └── config/
│       ├── __init__.py
│       ├── config_manager.py        # Configuration loader
│       ├── schemas.py               # Pydantic models
│       └── validation.py            # Config validation
└── .env.example                     # Template for local setup
```

## Environment Variable vs Config File Decision Matrix

| Use Case | Environment Variables | Config Files |
|----------|---------------------|--------------|
| Secrets (API keys, passwords) | ✅ Yes | ❌ No |
| Service URLs | ✅ Yes | ✅ Yes (with env interpolation) |
| Feature flags | ✅ Yes | ✅ Yes |
| Business logic constants | ❌ No | ✅ Yes |
| Development overrides | ✅ Yes (.env.local) | ✅ Yes |
| Static application config | ❌ No | ✅ Yes |
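
The first row of the matrix (no secrets in config files) can be checked mechanically. The sketch below walks a loaded config and flags secret-looking keys whose value is a literal rather than a bare `${VAR}` placeholder. The suffix list, the function name `find_literal_secrets`, and the `AIRTABLE_API_KEY` variable are all illustrative assumptions.

```python
import re
from typing import Any, Dict, List

# Assumption: keys ending in one of these suffixes carry secrets.
SECRET_SUFFIXES = ("password", "secret", "token", "api_key")
# A bare placeholder with no inline default, e.g. "${DB_PASSWORD}".
_BARE_PLACEHOLDER = re.compile(r"\$\{\w+\}")


def find_literal_secrets(config: Dict[str, Any], prefix: str = "") -> List[str]:
    """Return dotted paths of secret-looking keys that hold a literal value."""
    offenders: List[str] = []
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            offenders.extend(find_literal_secrets(value, path))
        elif key.lower().endswith(SECRET_SUFFIXES):
            # A bare ${VAR} keeps the secret in the environment, so it passes.
            is_placeholder = isinstance(value, str) and bool(
                _BARE_PLACEHOLDER.fullmatch(value)
            )
            if not is_placeholder:
                offenders.append(path)
    return offenders


sample = {
    "database": {"password": "hunter2"},
    "airtable": {"api_key": "${AIRTABLE_API_KEY}"},
    "security": {"api_key_required": "true"},
}
print(find_literal_secrets(sample))  # ['database.password']
```

Matching on suffixes rather than substrings keeps flags such as `api_key_required` from being reported.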

## Migration Plan with Priorities

### Phase 1 (High Priority) - Foundation
1. Create shared configuration management library
2. Replace hardcoded pagination limits in workspace service
3. Centralize service URLs and timeouts
4. Implement config validation
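
Config validation (item 4) could go beyond type checks and also enforce relationships between values. The sketch below uses a standard-library dataclass so it stays self-contained; the class name `Limits` and the specific rules are assumptions, and the same checks could equally live in Pydantic validators inside `shared/config/validation.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Illustrative subset of the limits listed under Current State Analysis."""

    pagination_default: int = 20
    pagination_max: int = 100
    batch_size: int = 5
    batch_max_size: int = 50

    def __post_init__(self) -> None:
        # Every limit must be a positive count.
        for name in ("pagination_default", "pagination_max",
                     "batch_size", "batch_max_size"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")
        # Defaults must fit inside their corresponding maximums.
        if self.pagination_default > self.pagination_max:
            raise ValueError("pagination_default must not exceed pagination_max")
        if self.batch_size > self.batch_max_size:
            raise ValueError("batch_size must not exceed batch_max_size")


Limits()  # the documented defaults pass validation
try:
    Limits(pagination_default=200)
except ValueError as exc:
    print(exc)  # pagination_default must not exceed pagination_max
```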

### Phase 2 (Medium Priority) - Service Migration
1. Migrate workspace-service to new config system
2. Update llm-orchestrator configuration
3. Standardize batch processing configurations
4. Implement environment-specific overrides

### Phase 3 (Low Priority) - Advanced Features
1. Add runtime configuration updates via API
2. Implement configuration change auditing
3. Add configuration hot-reloading
4. Create configuration management dashboard
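
For change auditing (item 2), a natural building block is a function that compares two config snapshots and lists what changed. This is a sketch under assumptions: `diff_config` is a hypothetical helper, and where the audit records are stored is left open.

```python
from typing import Any, Dict, List, Tuple

Change = Tuple[str, Any, Any]  # (dotted path, old value, new value)


def diff_config(
    old: Dict[str, Any], new: Dict[str, Any], prefix: str = ""
) -> List[Change]:
    """List the keys whose values differ between two config snapshots."""
    changes: List[Change] = []
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}.{key}" if prefix else key
        before, after = old.get(key), new.get(key)
        if isinstance(before, dict) and isinstance(after, dict):
            # Recurse so nested sections are reported key by key.
            changes.extend(diff_config(before, after, path))
        elif before != after:
            # A missing key shows up as None on the side where it is absent.
            changes.append((path, before, after))
    return changes


old = {"pagination": {"default_limit": 20, "max_limit": 100}}
new = {"pagination": {"default_limit": 20, "max_limit": 200},
       "redis": {"timeout": 5}}
print(diff_config(old, new))
# [('pagination.max_limit', 100, 200), ('redis', None, {'timeout': 5})]
```

Because absent keys are reported as `None`, this sketch cannot distinguish a removed key from one explicitly set to `None`; an audit log that needs that distinction would use a dedicated sentinel.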

## Example Implementation

### Shared Configuration Manager
```python
# shared/config/config_manager.py
import logging
import os
import re
from functools import lru_cache
from pathlib import Path
from typing import Any, Dict

import yaml
# Pydantic v1 API. In Pydantic v2, BaseSettings moved to the
# pydantic-settings package.
from pydantic import BaseSettings, Field

logger = logging.getLogger(__name__)

# Matches ${VAR} and ${VAR:default}
_ENV_PATTERN = re.compile(r"\$\{(\w+)(?::([^}]*))?\}")


class BaseAppConfig(BaseSettings):
    """Base configuration with common settings"""

    # Application
    service_name: str = Field(..., env="SERVICE_NAME")
    service_version: str = "1.0.0"
    environment: str = Field(default="development", env="ENVIRONMENT")

    # Server
    host: str = Field(default="0.0.0.0", env="HOST")
    port: int = Field(default=8000, env="PORT")
    server_timeout: int = Field(default=30, env="SERVER_TIMEOUT")

    # Database
    database_url: str = Field(..., env="DATABASE_URL")
    db_pool_size: int = Field(default=10, env="DB_POOL_SIZE")
    db_timeout: int = Field(default=30, env="DB_TIMEOUT")

    # Pagination
    pagination_default: int = Field(default=20, env="PAGINATION_DEFAULT")
    pagination_max: int = Field(default=100, env="PAGINATION_MAX")

    # Batch Processing
    batch_size: int = Field(default=5, env="BATCH_SIZE")
    batch_max_size: int = Field(default=50, env="BATCH_MAX_SIZE")

    class Config:
        env_file = ".env"
        case_sensitive = False


class ConfigManager:
    """Centralized configuration management"""

    def __init__(self, config_dir: Path = Path("config")):
        self.config_dir = config_dir
        self._config_cache: Dict[str, Any] = {}

    def load_config(self, service_name: str) -> Dict[str, Any]:
        """Load configuration; each later layer overrides the earlier ones"""
        if service_name in self._config_cache:
            return self._config_cache[service_name]

        # 1. Load base config
        config = self._load_yaml(self.config_dir / "base.yaml")

        # 2. Load environment-specific overrides
        env = os.getenv("ENVIRONMENT", "development")
        env_config = self._load_yaml(
            self.config_dir / "environments" / f"{env}.yaml"
        )
        if env_config:
            config = self._deep_merge(config, env_config)

        # 3. Load service-specific config
        service_config = self._load_yaml(
            self.config_dir / "services" / f"{service_name}.yaml"
        )
        if service_config:
            config = self._deep_merge(config, service_config)

        # 4. Apply environment variable interpolation
        config = self._interpolate_env_vars(config)

        self._config_cache[service_name] = config
        return config

    def _load_yaml(self, path: Path) -> Dict[str, Any]:
        """Load a YAML file; return {} if it is missing or unreadable"""
        if not path.exists():
            return {}
        try:
            with open(path, "r", encoding="utf-8") as f:
                return yaml.safe_load(f) or {}
        except (OSError, yaml.YAMLError) as e:
            logger.warning("Failed to load %s: %s", path, e)
            return {}

    def _deep_merge(
        self, base: Dict[str, Any], override: Dict[str, Any]
    ) -> Dict[str, Any]:
        """Recursively merge override into a copy of base"""
        merged = dict(base)
        for key, value in override.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = self._deep_merge(merged[key], value)
            else:
                merged[key] = value
        return merged

    def _interpolate_env_vars(self, value: Any) -> Any:
        """Resolve ${VAR} and ${VAR:default} placeholders in string values"""
        if isinstance(value, dict):
            return {k: self._interpolate_env_vars(v) for k, v in value.items()}
        if isinstance(value, list):
            return [self._interpolate_env_vars(v) for v in value]
        if isinstance(value, str):
            return _ENV_PATTERN.sub(self._resolve_placeholder, value)
        return value

    @staticmethod
    def _resolve_placeholder(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2)
        resolved = os.getenv(name, default)
        if resolved is None:
            # Design choice: a placeholder without a default is required.
            raise ValueError(f"Environment variable {name} is required but not set")
        return resolved


@lru_cache()
def get_config_manager() -> ConfigManager:
    """Get cached configuration manager"""
    return ConfigManager()
```

### Service-Specific Configuration
```python
# workspace-service/src/config.py
from functools import lru_cache

from pydantic import Field

from shared.config.config_manager import BaseAppConfig, get_config_manager


class WorkspaceConfig(BaseAppConfig):
    """Workspace service configuration"""

    # Workspace-specific settings
    max_workspaces_per_user: int = Field(default=50, env="MAX_WORKSPACES")
    max_members_per_workspace: int = Field(default=100, env="MAX_MEMBERS")
    default_workspace_template: str = Field(default="blank", env="DEFAULT_TEMPLATE")
    invitation_expiry_days: int = Field(default=7, env="INVITATION_EXPIRY")

    @classmethod
    def load(cls) -> "WorkspaceConfig":
        """Load configuration with file-based overrides"""
        file_config = get_config_manager().load_config("workspace-service")
        pagination = file_config.get("pagination", {})
        features = file_config.get("features", {})

        # The YAML is nested while the model fields are flat, so map the
        # sections explicitly instead of passing the whole dict as kwargs.
        # Other sections (server, database, ...) follow the same pattern.
        overrides = {
            "pagination_default": pagination.get("default_limit"),
            "pagination_max": pagination.get("max_limit"),
            "max_workspaces_per_user": features.get("workspace_limit_per_user"),
            "max_members_per_workspace": features.get("members_limit_per_workspace"),
        }
        # Note: init kwargs take priority over environment variables in
        # BaseSettings, so a literal value in a YAML file beats the env var.
        return cls(**{k: v for k, v in overrides.items() if v is not None})


@lru_cache()
def get_workspace_config() -> WorkspaceConfig:
    """Get cached workspace configuration"""
    return WorkspaceConfig.load()
```
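
Once the limits live in configuration, request handlers can read them instead of embedding 20 and 100 directly. The sketch below shows the clamping logic in isolation; `PaginationSettings` is a stand-in for the two pagination fields on `WorkspaceConfig`, and `resolve_limit` is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PaginationSettings:
    """Stand-in for the pagination fields on WorkspaceConfig."""

    pagination_default: int = 20
    pagination_max: int = 100


def resolve_limit(requested: Optional[int], settings: PaginationSettings) -> int:
    """Return the page size to use for a request."""
    if requested is None:
        # The client did not ask for a size, so use the configured default.
        return settings.pagination_default
    # Clamp into the range [1, pagination_max].
    return max(1, min(requested, settings.pagination_max))


settings = PaginationSettings()
print(resolve_limit(None, settings))  # 20
print(resolve_limit(500, settings))   # 100
print(resolve_limit(0, settings))     # 1
```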

## Benefits of This Approach

1. **Centralized Management**: Single source of truth for all configurations
2. **Environment Flexibility**: Easy switching between dev/staging/prod
3. **Type Safety**: Pydantic validation rejects missing or wrongly typed values at startup
4. **Security**: Clear separation of secrets vs non-sensitive config
5. **Maintainability**: Eliminates hardcoded values across services
6. **Scalability**: Supports complex multi-service configurations

## Implementation Timeline

- **Week 1**: Create shared configuration library and base schemas
- **Week 2**: Migrate workspace-service as proof of concept
- **Week 3**: Update remaining Python services
- **Week 4**: Add environment-specific configurations and testing

Total estimated effort: ~20-25 development hours across 4 weeks.