Refusned · May 21, 2026
diff --git a/.github/workflows/ci-python.yml b/.github/workflows/ci-python.yml
@@ -0,0 +1,53 @@
+name: CI Python
+
+# Complements static-checks.yml: runs ruff + pytest for the Python layers
+# (api/reliability/, the reliability modules). static-checks.yml remains
+# in place for the bash deployment tests.
+
+on:
+  push:
+    branches: [main]
+    paths:
+      - "api/**"
+      - "evals/**"
+      - ".github/workflows/ci-python.yml"
+  pull_request:
+    branches: [main]
+    paths:
+      - "api/**"
+      - "evals/**"
+      - ".github/workflows/ci-python.yml"
+
+concurrency:
+  group: ci-python-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  reliability:
+    name: Reliability layer (Python ${{ matrix.python-version }})
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.11", "3.12"]
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+          cache: pip
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install fastapi pydantic httpx pytest pytest-asyncio ruff
+
+      - name: Ruff (lint)
+        run: ruff check api/reliability/
+
+      - name: Pytest — reliability layer
+        working-directory: api
+        run: python -m pytest reliability/tests/ -v --tb=short
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,41 @@
 # Changelog
 
+Format: [Keep a Changelog](https://keepachangelog.com/ru/1.1.0/) · versioning: [SemVer](https://semver.org/lang/ru/).
+
+## [1.1.0] — 2026-05-21
+
+This release focuses on production reliability and verifiable behavior. The new code does not
+change the bot's existing behavior: the layer is wired in explicitly (see `api/reliability/INTEGRATION.md`).
+
+### Added
+
+- **Reliability layer** (`api/reliability/`):
+  - `healthcheck.py`: extended probes `/health/detailed` (status of 4 LLM providers,
+    models, queue, uptime, 152-FZ mode flag), `/health/live`, `/health/ready`;
+    in 152-FZ mode only Russian (RF) providers are probed
+  - `cost_ceiling.py`: daily LLM spend ceiling (`KENT_MAX_DAILY_COST_USD`) with a
+    `warning` at 80%, LLM endpoints blocked with HTTP 429 at 100%, reset at midnight UTC
+  - `redaction.py`: masks PII in logs (emails, phone numbers, OpenAI/Anthropic/Telegram
+    tokens, Bearer tokens, card numbers)
+  - 19 unit tests (`api/reliability/tests/`), all passing
+- **Evals** (`evals/`):
+  - `regression_set.yaml`: regression gold set of 15 reference queries
+    (tool_calling, RAG, ambiguous, prompt_injection, long_context, pii_sensitive, edge_case)
+  - `run_regression.py`: pytest harness that checks responses against expected rules
+  - `cost_report.md`: cost baseline per provider and request type
+- **Documentation**:
+  - `docs/known_limitations.md`: product boundaries
+  - `docs/failure_modes.md`: taxonomy of 10 failure types
+  - README: "Engineering decisions", "Reliability", "Evals & Observability" sections
+- **CI**: `.github/workflows/ci-python.yml` runs ruff + pytest for the reliability layer
+  on Python 3.11 and 3.12 (complements the existing `static-checks.yml`)
+
+### Changed
+
+- README rewritten as an engineering case study with an honest description of the architecture
+  (overlay on top of OpenClaw, LangGraph only for RAG routing)
+- Open-to-work badge: "AI Automation Specialist" → "AI / LLM Application Engineer"
+
 ## [1.0.0] — 2026-04-11
 
 ### Core

diff --git a/README.md b/README.md
diff --git a/VERSION b/VERSION
@@ -1 +1 @@
-1.0.0
+1.1.0
diff --git a/api/reliability/INTEGRATION.md b/api/reliability/INTEGRATION.md
@@ -0,0 +1,105 @@
+# Wiring the reliability layer into the Kent API Gateway
+
+The three modules in `api/reliability/` plug into the existing `api/main.py`.
+All changes are additive: none of the current behavior breaks.
+
+`api/` uses a flat layout (`uvicorn main:app`, cwd = `api/`), so
+`reliability` is imported as a top-level package: `from reliability.* import ...`.
+
+---
+
+## 1. Extended healthcheck
+
+The existing `GET /health` stays unchanged (a lightweight probe for Docker).
+The module adds non-conflicting endpoints: `/health/detailed`, `/health/live`,
+`/health/ready`.
+
+In `api/main.py`, after the middleware registration block (around line 708):
+
+```python
+from reliability.healthcheck import router as health_router, mark_llm_success
+
+app.include_router(health_router)
+```
+
+In the Multi-LLM Provider Factory (`api/russian_llm.py`), after each
+successful provider response:
+
+```python
+from reliability.healthcheck import mark_llm_success
+mark_llm_success()
+```
+
+## 2. Cost ceiling
+
+```python
+import os
+from pathlib import Path
+from starlette.middleware.base import BaseHTTPMiddleware
+from reliability.cost_ceiling import CostTracker, cost_ceiling_middleware
+
+cost_tracker = CostTracker(
+    daily_limit_usd=float(os.getenv("KENT_MAX_DAILY_COST_USD", "10.0")),
+    storage_path=Path(os.getenv("KENT_COST_STATE", "/var/lib/kent/cost.json")),
+)
+app.add_middleware(BaseHTTPMiddleware, dispatch=cost_ceiling_middleware(cost_tracker))
+```
+
+After each LLM response in the Provider Factory:
+
+```python
+await cost_tracker.record(provider="openai", model="gpt-4o", cost_usd=calculated_cost)
+```
+
+Behavior: at 80% of the daily limit, `logger.warning`; at 100%, `logger.error`
+plus HTTP 429 for LLM endpoints. The counter resets at midnight UTC.
+
+Add to `.env.example`:
+```
+KENT_MAX_DAILY_COST_USD=10.0
+KENT_COST_STATE=/var/lib/kent/cost.json
+```
+
+## 3. PII redaction in logs
+
+At the logging initialization point in `api/main.py`:
+
+```python
+import logging
+from reliability.redaction import RedactionFilter
+
+for handler in logging.getLogger().handlers:
+    handler.addFilter(RedactionFilter())
+```
+
+Masks: emails, phone numbers, OpenAI/Anthropic/Telegram tokens, Bearer tokens, card numbers.
+Does not mask the contents of RAG documents (working context).
+
+## 4. Dockerfile
+
+`api/Dockerfile` currently copies only `main.py langchain_module.py russian_llm.py`.
+Add a copy step for the package:
+
+```dockerfile
+COPY main.py langchain_module.py russian_llm.py ./
+COPY reliability/ ./reliability/
+```
+
+## Tests
+
+```bash
+cd api
+pip install pytest pytest-asyncio httpx
+python -m pytest reliability/tests/ -v
+```
+
+19 tests: healthcheck (4), cost_ceiling (5), redaction (10). They run in CI
+via `.github/workflows/ci-python.yml`.
+
+## What is deliberately NOT wired in by default
+
+The modules ship in the repository with tests and green CI, but the three changes above
+(include_router, add_middleware, addFilter) are applied manually. The reason:
+`api/main.py` serves the live production bot @ask_kent_bot, so changes at
+initialization points are verified on staging before prod. The reliability layer
+is ready as a drop-in; wiring it in is a separate, controlled step.
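Section 3 of INTEGRATION.md attaches `RedactionFilter` to every root handler, but `redaction.py` itself is not part of this excerpt. The following is only a minimal sketch of how such a filter can work with the standard `logging` module. The names `redact_sketch` and `RedactionFilterSketch` and both regex patterns are hypothetical, chosen for illustration; the actual module may use different patterns and placeholders.

```python
import logging
import re

# Illustrative patterns only; the real reliability.redaction module may differ.
_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),
    (re.compile(r"Bearer\s+[A-Za-z0-9._~+/-]+=*"), "Bearer [TOKEN]"),
]


def redact_sketch(text: str) -> str:
    """Replace every match of each pattern with its placeholder."""
    for pattern, placeholder in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


class RedactionFilterSketch(logging.Filter):
    """Masks the fully formatted message in place and always keeps the record."""

    def filter(self, record: logging.LogRecord) -> bool:
        # getMessage() interpolates args, so values passed as %s are masked too.
        record.msg = redact_sketch(record.getMessage())
        record.args = None  # args are already merged into msg
        return True
```

Attaching the filter to handlers rather than to a single logger, as section 3 does, means records propagated from child loggers pass through it as well.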
diff --git a/api/reliability/__init__.py b/api/reliability/__init__.py
@@ -0,0 +1,24 @@
+"""
+Kent reliability layer.
+
+Drop-in modules on top of the existing FastAPI gateway (api/main.py):
+- healthcheck: extended /health/detailed with provider status
+- cost_ceiling: daily LLM spend ceiling
+- redaction: PII masking in logs
+
+For wiring instructions, see api/reliability/INTEGRATION.md.
+"""
+
+from .cost_ceiling import CostTracker, cost_ceiling_middleware
+from .healthcheck import mark_llm_success
+from .healthcheck import router as health_router
+from .redaction import RedactionFilter, redact
+
+__all__ = [
+    "CostTracker",
+    "RedactionFilter",
+    "cost_ceiling_middleware",
+    "health_router",
+    "mark_llm_success",
+    "redact",
+]
diff --git a/api/reliability/cost_ceiling.py b/api/reliability/cost_ceiling.py
@@ -0,0 +1,166 @@
+"""
+Cost ceiling for Kent AI Assistant.
+
+Daily USD ceiling on LLM calls. At 80% a warning is logged; at 100%
+requests are refused (HTTP 429 with a clear message).
+
+INTEGRATION (see api/reliability/INTEGRATION.md):
+    from reliability.cost_ceiling import CostTracker, cost_ceiling_middleware
+
+    cost_tracker = CostTracker(
+        daily_limit_usd=float(os.getenv("KENT_MAX_DAILY_COST_USD", "10.0")),
+        storage_path=Path("/var/lib/kent/cost.json"),
+    )
+
+    # 1) Record every successful LLM call:
+    await cost_tracker.record(provider="openai", model="gpt-4o", cost_usd=0.012)
+
+    # 2) Add the middleware to block requests at 100%:
+    app.add_middleware(BaseHTTPMiddleware, dispatch=cost_ceiling_middleware(cost_tracker))
+"""
+from __future__ import annotations
+
+import asyncio
+import json
+import logging
+from dataclasses import dataclass
+from datetime import datetime, timezone
+from pathlib import Path
+
+from fastapi import Request
+from fastapi.responses import JSONResponse
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass(slots=True)
+class DailyCost:
+    day: str  # ISO date YYYY-MM-DD (UTC)
+    total_usd: float
+    by_provider: dict[str, float]
+
+
+class CostTracker:
+    """Simple persistent tracker of daily spend per LLM provider.
+
+    For production load (>10 RPS), replace with Redis and INCRBYFLOAT.
+    This version is file-backed for one-process simplicity.
+    """
+
+    def __init__(self, daily_limit_usd: float, storage_path: Path) -> None:
+        self.daily_limit_usd = daily_limit_usd
+        self.storage_path = storage_path
+        self._lock = asyncio.Lock()
+        self._current: DailyCost = self._load_or_init()
+
+    def _load_or_init(self) -> DailyCost:
+        today = datetime.now(timezone.utc).date().isoformat()
+        if self.storage_path.exists():
+            try:
+                data = json.loads(self.storage_path.read_text())
+                if data.get("day") == today:
+                    return DailyCost(
+                        day=today,
+                        total_usd=float(data["total_usd"]),
+                        by_provider=dict(data["by_provider"]),
+                    )
+            except (json.JSONDecodeError, KeyError, ValueError) as e:
+                logger.warning("cost_tracker: corrupted state, resetting: %s", e)
+        return DailyCost(day=today, total_usd=0.0, by_provider={})
+
+    def _persist(self) -> None:
+        self.storage_path.parent.mkdir(parents=True, exist_ok=True)
+        self.storage_path.write_text(
+            json.dumps(
+                {
+                    "day": self._current.day,
+                    "total_usd": round(self._current.total_usd, 6),
+                    "by_provider": {
+                        k: round(v, 6) for k, v in self._current.by_provider.items()
+                    },
+                },
+                indent=2,
+            )
+        )
+
+    def _rollover_if_needed(self) -> None:
+        today = datetime.now(timezone.utc).date().isoformat()
+        if self._current.day != today:
+            logger.info(
+                "cost_tracker: rollover %s -> %s, final total $%.4f",
+                self._current.day,
+                today,
+                self._current.total_usd,
+            )
+            self._current = DailyCost(day=today, total_usd=0.0, by_provider={})
+
+    async def record(self, *, provider: str, model: str, cost_usd: float) -> None:
+        """Call after every LLM response (successful or with partial output)."""
+        async with self._lock:
+            self._rollover_if_needed()
+            prev_total = self._current.total_usd
+            self._current.total_usd += cost_usd
+            self._current.by_provider[provider] = (
+                self._current.by_provider.get(provider, 0.0) + cost_usd
+            )
+            self._persist()
+
+            ratio = self._current.total_usd / self.daily_limit_usd
+            prev_ratio = prev_total / self.daily_limit_usd
+
+            # Log a warning when the 80% threshold is crossed
+            if prev_ratio < 0.8 <= ratio:
+                logger.warning(
+                    "cost_tracker: 80%% threshold reached — daily=$%.4f, limit=$%.2f",
+                    self._current.total_usd,
+                    self.daily_limit_usd,
+                )
+            if prev_ratio < 1.0 <= ratio:
+                logger.error(
+                    "cost_tracker: DAILY LIMIT EXCEEDED — daily=$%.4f, limit=$%.2f. "
+                    "New LLM requests will be rejected until midnight UTC.",
+                    self._current.total_usd,
+                    self.daily_limit_usd,
+                )
+
+    def is_limit_exceeded(self) -> bool:
+        self._rollover_if_needed()
+        return self._current.total_usd >= self.daily_limit_usd
+
+    def snapshot(self) -> dict[str, float | str | dict[str, float]]:
+        self._rollover_if_needed()
+        return {
+            "day": self._current.day,
+            "total_usd": round(self._current.total_usd, 4),
+            "limit_usd": self.daily_limit_usd,
+            "ratio": round(self._current.total_usd / self.daily_limit_usd, 3),
+            "by_provider": {
+                k: round(v, 4) for k, v in self._current.by_provider.items()
+            },
+        }
+
+
+def cost_ceiling_middleware(tracker: CostTracker):
+    """Return a dispatch function for Starlette's BaseHTTPMiddleware that rejects
+    requests to LLM endpoints once the daily limit is reached."""
+
+    # Endpoints that cost money; adjust to match Kent's routes.
+    LLM_PATH_PREFIXES = ("/chat", "/agents", "/ask", "/rag", "/skills")
+
+    async def dispatch(request: Request, call_next):
+        if request.url.path.startswith(LLM_PATH_PREFIXES) and tracker.is_limit_exceeded():
+            return JSONResponse(
+                status_code=429,
+                content={
+                    "error": "daily_cost_limit_exceeded",
+                    "message": (
+                        f"Daily LLM cost limit ${tracker.daily_limit_usd:.2f} reached. "
+                        "Try again after midnight UTC or contact admin to raise the cap."
+                    ),
+                    "snapshot": tracker.snapshot(),
+                },
+                headers={"Retry-After": "3600"},
+            )
+        return await call_next(request)
+
+    return dispatch
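Two details of `cost_ceiling.py` are easier to see in isolation. `record()` logs only on the call that crosses a threshold, not on every call above it. The middleware sends a fixed `Retry-After: 3600`, although the cap is described as lifting at midnight UTC. The sketch below restates the crossing check as a pure function and shows one possible way to derive the header value from the time remaining. Both `crossed` and `seconds_until_utc_midnight` are hypothetical helpers, not part of this PR.

```python
from datetime import datetime, timedelta, timezone


def crossed(prev_total: float, new_total: float, limit: float, threshold: float) -> bool:
    """True only when the spend ratio moves from below to at or above threshold."""
    return prev_total / limit < threshold <= new_total / limit


def seconds_until_utc_midnight(now: datetime) -> int:
    """Whole seconds from an aware `now` to the next 00:00 UTC (at least 1)."""
    now_utc = now.astimezone(timezone.utc)
    next_midnight = (now_utc + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return max(1, int((next_midnight - now_utc).total_seconds()))
```

Under these assumptions the response header could be built as `{"Retry-After": str(seconds_until_utc_midnight(datetime.now(timezone.utc)))}`, so clients are told to wait until the counter actually resets rather than retrying hourly.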