Skip to content

release(v0.15.1): Polaris — WR diagnostics + per-user accuracy + reset hygiene#44

Open
imzzaidd wants to merge 41 commits into
mainfrom
feat/polaris-wr-diagnostics
Open

release(v0.15.1): Polaris — WR diagnostics + per-user accuracy + reset hygiene#44
imzzaidd wants to merge 41 commits into
mainfrom
feat/polaris-wr-diagnostics

Conversation

@imzzaidd
Copy link
Copy Markdown
Contributor

@imzzaidd imzzaidd commented May 23, 2026

Release v0.15.1 "Polaris"

Patch release on top of v0.15.0 "Orion". 11 commits covering the
WR diagnostics layer, per-user accuracy correctness, and the
reset / ghost-DM hygiene cluster surfaced during live operator
testing on 2026-05-21/22.

The release builds toward the operator's standing target of 70%+
tracked decisive WR by raising the emission bar, hard-skipping
unsafe-cycle calls, accelerating warmup exits, and giving the
operator runtime visibility into which signals are dragging WR
down.


A — Polaris diagnostics layer

A complete per-signal WR observability + correction loop. Every
signal now has a measured decisive WR, the engine can invert or
dampen a signal whose sign is flipped, and the operator can pin
overrides at runtime.

  • A1signal_outcomes SQLite table: per-prediction per-
    signal vote + outcome rows written by the resolver. Source of
    truth for all per-signal accuracy queries.
  • A2 — Signal auditor: rolling WR-driven classifier per
    signal (INVERT / DAMPEN / WATCH / KEEP) over configurable
    window + sample threshold.
  • A3 — Signal-polarity registry: runtime sign + cap
    overrides applied before composite assembly. Supports both
    auto (auditor-driven) and manual (operator-pinned) modes.
    Persisted to signal_polarity.
  • A4/wr-diagnose Telegram command: per-signal WR
    breakdown with inversion / dampen / watch sections + active
    overrides.
  • A5/signal-override Telegram command: operator-pinned
    polarity that wins over the auto auditor.
  • A6pnpm wr:audit CLI: exits non-zero when any active
    signal sits below the INVERT threshold so CI can gate releases
    on per-signal calibration health.

B — Per-user accuracy correctness

Closes a long-standing visibility bug where every operator on the
same Telegram bot saw a shared WR aggregated across all users.

  • B1user_id column is now persisted at logPrediction
    insert time (the read paths already filtered correctly — the
    bug was that the write path never populated the column).
  • B2engine.predict() gains an options.userId that
    flows through to every logPrediction call site (forecast,
    directional with TP/SL, forecast resolve).
  • B3 — All Telegram and AI tool-call sites that originate
    from an authenticated user thread userId through:
    /predict, /diagnose, /polymarket, get_prediction,
    predict_price, polymarket_edge, bulk-ladder.
  • B4 — New regression test covers the end-to-end user-scoped
    logging path.

C — Predictiveness signals (leading-indicator alpha)

Engine now anticipates moves via leading flow indicators instead
of reacting to lagging structure. Operator standing rule:
predictions must anticipate big moves, not chase them.

  • C1leadingFlow signal aggregator: taker momentum +
    orderbook imbalance + whale flow + CVD slope + funding
    inflection. Injected with 30% weight at the front of the
    on-chain composite via sigCfg.onChain.leadingFlowWeight.
  • C2 — CVD-gated SMC vote: when SMC reads bullish demand but
    CVD's recent slope is bearish (demand is being eaten),
    neutralize the structural vote instead of letting it outvote
    real-time order-flow truth.
  • C3 — 8 reversal FOL rules:
    failed_support_break, failed_resistance_rejection,
    asian_range_continuation_bull/bear,
    funding_peak_reversal_bearish,
    funding_trough_reversal_bullish,
    sweep_and_reverse_bullish_trap,
    sweep_and_reverse_bearish_trap.
  • C4 — 4 structural FOL rules:
    stable_supply_expanding_bullish /
    stable_supply_contracting_bearish,
    alt_rotation_confirmed_alt_bullish,
    btc_rotation_dominant_btc_bullish (TOTAL2/TOTAL3 dominance
    algebra).
  • C5 — 2 volatility FOL rules:
    vol_term_inverted_break_imminent,
    vol_acceleration_post_consolidation (realized vol + term
    structure).
  • C6 — Realized-volatility signal: short (12h) vs long (48h)
    realized σ + acceleration, pure compute on existing OHLCV.
  • C7MarketContext gains spotPrice,
    stablecoinNetExpansionUsd, altDominancePct,
    nonBtcEthDominancePct, realizedVol1h/4h, volTermSlope,
    volAcceleration.

D — Band-edge deformation (post-forecast)

Two post-processors run after computePredictionForecast closes,
mutating a shallow copy of the forecast so the band absorbs
predictable wick patterns the prior grader counted as MISS.

  • D1applyLiquidationMagnet: when a significant long /
    short liquidation cluster sits beyond the band edge on the
    matching side, stretch the edge to (cluster ± small buffer)
    capped at 2× the original half-width. Symbol-agnostic — runs
    off liquidationProximity data the engine already gathers.
  • D2applyMomentumBandWidth: when taker momentum + OB
    imbalance derivative + CVD short slope stack one way with
    magnitude ≥ 0.25, widen the band on that side by up to 1.8×
    while modestly widening the other edge.

E — Warmup acceleration (α + β + γ)

Three independent accelerators that reduce the real-world sample
budget needed to exit warmup-conservatism and engage full-strength
prediction.

  • E1 (α) — Calibrator bootstrap from history: replays up to
    1000 most-recent resolved predictions through the calibrator +
    rule tracker on boot when the persistence layer comes back
    empty. Fresh DB or post-/reset start exits warmup in seconds
    instead of 30+ live resolutions. Idempotent.
  • E2 (β) — Graduated warmup multiplier: replaces the binary
    cliff with sqrt(samples / minSamples) scaling on the
    probability + confidence caps so confidence climbs smoothly
    through warmup instead of unlocking on a single lucky
    resolution.
  • E3 (γ) — Graduated outcome labels for Platt scaling: the
    calibrator now accepts a number ∈ [0,1] in addition to a
    boolean wasCorrect, feeding partial-credit resolution scores
    directly to the SGD update. Empirically halves the sample
    count needed to converge on the same MAE.

F — Quality gates (raise the floor)

  • F1 — Per-family emission floors via
    chronovisor.perFamilyMinLogConfidence
    (micro=0.45, intraday=0.40, swing=0.35, macro=0.30). Short
    horizons (where prod WR sat at 15–32%) get a higher bar.
  • F2 — Sideways dead zone widened from ±0.005 to ±0.04
    around 0.5 — uncertain calls now correctly file as RANGE
    instead of being emitted as razor-thin directional predictions.
  • F3wrWatchdog config drives a rolling-WR warning DM in
    the resolver: when recent decisive WR falls below warnBelow
    over windowSize samples, the watchdog emits an info
    notification naming the worst-performing signal.
  • F4 — Cycle-degraded hard-skip: directional predictions
    with cycleHealth.status ∈ {degraded, unsafe} AND
    systemConfirmations < 3 are now hard-skipped at
    qualification time so they never enter the resolution
    pipeline. Cold-start (warmingUp) is exempt.
  • F5 — RANGE gate label fix: the 🚦 Gate: 0/3 — SKIP
    label was misleading for RANGE plays (sideways forecasts have
    no direction to confirm). RANGE plays now render
    🚦 Gate: RANGE play (3-system gate applies to directional only).

G — Reset hygiene + ghost-DM elimination

Operator bug report 2026-05-22: /reset all and manual sqlite
DELETE runs were leaving stale state that re-emerged as ghost
predictions in the next cycle. Three causes, fixed in one atomic
operation.

  • G1performFullReset(): atomic wipe of every
    prediction-related table including Polaris additions
    (signal_outcomes, signal_polarity, calibration_family,
    calibration_symbol, rule_stats, volatility, weights,
    patterns), every alert rule, every pending notification.
  • G2ChronoVisorEngine.getResetCallbacks(): exposes
    named clear() functions for every in-process registry the
    engine owns (CalibratorRegistry, RuleTracker,
    VolatilityRegistry, SignalPolarityRegistry, pattern library,
    warmup-state cached counts). Without flushing these, the next
    debounced ml-state save wrote the pre-wipe snapshot straight
    back to DB.
  • G3 — Orphan-row guard in sendTelegramDm: before
    delivering a prediction_resolved or price_threshold DM
    the channel checks that the underlying prediction /
    alert-rule row still exists. If not, the DM is suppressed.
    Eliminates the post-wipe ghost bubble. Resolver writes
    predictionId into metadata so the guard has a key.
  • G4/reset all reply gains a visible "VIZZOR STATS
    RESET" divider block so chat scrollback has an unambiguous
    "everything above is stale" marker (Telegram permanently
    keeps history).
  • G5 — Polarity reconciliation cadence: the existing
    CALIBRATION_RETRAIN_INTERVAL now also triggers signal-
    polarity reconciliation from the auditor so a freshly-seen
    sign error is corrected within ~20 resolutions of becoming
    statistically defensible. Manual overrides are preserved.

H — /predict card renderer

Operator-requested compact trader-channel format. One card per
horizon, MarkdownV2-escaped, with branded glyphs for known
assets and a clean fallback for everything else.

₿ BITCOIN · 30m
💰 BTC Price: $77,733.83
💵 Direction: ➖ RANGE (55%)
🪙 Band: $77,656.09 — $77,811.56
💹 Best Play: Range fade — long $77,656.09, short $77,811.56 (no leverage)
🟣 SOLANA · 30m
💰 SOL Price: $87.40
💵 Direction: 📉 SHORT (61%)
🪙 Entry Zone: $87.40 — $87.44
📈 TP1: $87.29 (-0.13%)
📊 SL: $87.54 (+0.16%)
⚠ Skip: R:R 1:0.78 — risk exceeds reward, no trade

Multi-horizon replies separate cards with a horizontal rule and
surface a "Multi-horizon consensus" chip when per-horizon
directions disagree.

symbol-display registry holds the branded glyphs with operator
extension via chronovisor.symbolDisplay.<TICKER> in config.
Unknown symbols fall back to a generic icon + raw ticker —
features still apply universally (no symbol-list gating).


Impact

  • Per-user WR: reports now correctly scope to the calling
    operator's history. Pre-fix, every operator on the same bot
    saw a shared aggregate.
  • Tracked WR pollution: directional emissions on degraded /
    unsafe cycles no longer enter the resolution pipeline. The
    raised per-family floors + widened dead zone push genuinely
    uncertain calls to file as RANGE.
  • Warmup time-to-full-power: bootstrap + graduated cap +
    graduated SGD label cut the practical sample budget by roughly
    half on synthetic + replay tests.
  • Ghost DMs: post-reset ghost bubbles eliminated by the
    orphan-row guard plus the in-memory registry flush.
  • Operator visibility: /wr-diagnose, /signal-override,
    the rolling-WR watchdog DM, and the polarity reconciliation
    DM give the operator a closed-loop view of which signals are
    helping vs hurting.

Breaking changes

None. All new config fields ship with sane defaults; existing
configs without the new keys behave identically to v0.15.0 except
for the higher minLogConfidence floor (0.25 → 0.30) and the
wider sideways dead zone (0.495/0.505 → 0.46/0.54), both of
which are documented as intentional WR-quality changes.

Test plan

  • pnpm typecheck — 0 errors
  • pnpm lint — 0 errors (334 pre-existing non-null-assertion warnings)
  • pnpm test — 133 files, 1630 tests, all green
  • Operator soak: 24h run on prod data with /wr and
    /wr-diagnose polled hourly
  • /reset all end-to-end: confirm no ghost DMs land after
    the wipe completes
  • /predict BTC 30m from two different operator accounts
    and confirm /wr shows distinct per-user histories

Round 2 — 2026-05-23 (additional commits)

Follow-up sprint after operator MISS data from 2026-05-22/23 showed
that BTC tracked WR was still sitting at ~37% with RANGE forecasts
landing 0.01–0.17% outside the displayed band and SMC self-
contradictions (bearish bias + supply zone, yet bullish vote) leaking
into directional emissions. Round 2 closes those specific failure
modes and adds the leading-edge signals + calibration acceleration
that didn't make Round 1.

Quick wins (ω₁–ω₆)

  • ω₁ — Dead ICT session + no active flow + thin confirmations
    hard-skipped at qualification time (was advisory).
  • ω₂ — SMC self-contradiction detector neutralizes the SMC vote
    when smc.vote disagrees with both smcLastBreakType and
    smcBias.
  • ω₃ — Universal R:R<1.0 hard-skip across all tracking profiles.
    Negative-EV trades never reach chat regardless of profile.
  • ω₄/precisions and /wr render "— (n/m, warming up)"
    until a calibrator has at least 10 decisive resolutions. Stops
    the bot from publishing 14.5% / 22% noise WR figures.
  • ω₅ — Realized-vol features (volTermSlope, volAcceleration)
    feed band-magnitude lift in composeMomentumDirection (direction
    untouched — vol is a magnitude indicator only).
  • ω₆/predict card gains a "🔬 Why" section per emission:
    per-signal CF contributions, triple-system status, guardrails
    snapshot, short reasoning paragraph. On by default.

Calibration acceleration (δ ε θ)

  • δ — Fresh per-symbol Platt shards inherit the family's
    calibration parameters via inheritFamilyPriorIntoShard() instead
    of starting cold. New tokens skip the ~48 h warmup window.
  • ε — Calibrator now keeps per-(family, regime) shards in
    addition to per-family and per-(symbol, family). Each resolution
    routes to all four buckets so a bull-trained curve no longer
    poisons bear emissions.
  • θupdatePlatt SGD step is scaled by |p − 0.5| × 2 + 0.5.
    High-confidence misses correct the calibrator harder than low-
    confidence ones.

New free-tier signals (H K L N π σ τ)

  • H — Cross-venue funding aggregator: Binance + Bybit + OKX
    public funding-rate endpoints, 30 s cache, exposes weighted
    average + binanceDivergencePct + maxDivergencePct. Drives
    three new FOL rules around single-venue over-leverage.
  • K — ETH gas spike detector via Etherscan free tier: 5 min
    baseline + 3× spike threshold. A 3×-baseline gas spike inside
    5 min historically precedes ETH liquidation cascades by 30–90 min.
  • L — BTC mempool fee pulse via mempool.space: 5 min baseline +
    isCongested threshold. Fee acceleration is a 30–90 min
    leading volatility signal.
  • N — Volume profile signal: 50-bucket profile, Point of
    Control (POC), 70%-volume value area, insideValueArea flag.
    Pure compute on existing klines. Drives
    volume_profile_above/below_va_mean_revert_* rules.
  • π — Multi-timeframe CVD divergence: parallel 5 m + 1 h kline
    fetch, classifies 5m-bullish / 1h-bearish (and mirror) as multi-
    TF divergence — smart-money distribution vs retail pump signal.
  • σ — Funding-payment-timing rules: ±15 min around 00:00 /
    08:00 / 16:00 UTC funding settlements emit pre-payment squeeze
    and post-payment relief / capitulation calls.
  • τ — Session-momentum continuation detector: ICT kill-zone
    alignment (London / NY) combined with active leading flow
    emits session-continuation rules.

Rationale

The Round 2 surface targets the exact MISS pattern in the operator's
2026-05-22/23 screenshots: dead-session emissions, SMC self-
contradictions reaching chat, R:R<1 trades labeled "Tentative",
WR figures published from <10-sample shards, and missing leading-
edge signals (cross-venue funding, gas / mempool, multi-TF flow)
that would have flagged the BTC RANGE breaks before they happened.

Verification

  • pnpm typecheck — 0 errors
  • pnpm lint — 0 errors (345 pre-existing non-null-assertion warnings)
  • pnpm test — 133 files, 1630 tests, all green

Round 2 commits

  • 6b7da0f — fix(ml): hard-skip dead-session + R:R<1 emissions (ω₁ ω₃)
  • eb4ef8f — feat(telegram): min-sample-gated WR + full-context /predict (ω₄ ω₆)
  • dbb27b1 — feat(ml): realized-vol band sizing + multi-TF CVD divergence (ω₅ π)
  • 2187247 — feat(ml): regime + cross-symbol + confidence-weighted calibration (ε δ θ)
  • 9d61dfb — feat(data): cross-venue funding + ETH gas + BTC mempool sources (H K L)
  • 4be9c7e — feat(ml): volume profile + FOL rules + engine wiring for Round 2 (ω₂ + N + H I J K L M N π σ τ)

Round 2.5 — 2026-05-24 (additional commits)

After Round 2 shipped, live operator data showed BTC tracked WR
sitting at ~32% on n=10 decisive resolutions. The number itself is
inside the 95% CI of a coin flip at that sample size, but examining
the emission stream surfaced five concrete gaps Round 2 did not
close — all of which let weak-edge predictions reach chat regardless
of how good the calibrator gets.

Gap 1 — emission-floor bump (config)

perFamilyMinLogConfidence floors raised:
micro 0.45→0.58 · intraday 0.40→0.55 · swing 0.35→0.52 · macro 0.30→0.50.
A 55%-probability prediction cannot resolve at 70% even with perfect
calibration — the math forbids it. Volume drops ~40%, surviving
emissions resolve with meaningfully better EV. Floors lower back to
Round 2 values once decisive samples per family cross 60 and
calibration is trusted across all 3 regime shards.

Gap 2 — adversarial signal-conflict detector (ι)

New src/core/chronovisor/signal-conflict.ts scans the
onChain × logicRules pair and flags conflict when both |CF| ≥ 0.6 AND
their signs oppose. Engine hook fires before the existing
systemConfirmations / targetEqualsEntry normalize-to-sideways block:
on conflict the prediction is reframed as RANGE, probability is
collapsed toward 0.5–0.6, the forecast is recomputed as sideways,
and a signal_conflict reasoning line is threaded into the trigger
snapshot. Catches the exact SMC-bearish-vs-ICT-bullish-vs-RANGE-call
pattern in the operator's old MISS screenshots.

Gap 3 — composite EV gate

ω₃ catches R:R<1 and the per-family floor catches low-confidence calls
but a 56% / R:R 1.05 combo escapes both axes individually with
near-zero edge. Added prediction-qualification.ts rule:
skip if (probability − 0.5) × edgeRatio < 0.04 for directional
predictions with R:R ≥ 1.0. Knocks out the marginal positive-RR but
low-edge tail that was hitting the calibrator with coin-flip outcomes.

Gap 4 — regime-aware emission throttle

Round 2's ε regime-bucketed calibration LEARNS per regime but didn't
THROTTLE EMISSIONS in a known-bad regime. Added:

  • ConfidenceCalibrator.getHitRate() — cumulative bin-aggregate WR
  • CalibratorRegistry.getRegimeShardHitRate(family, regime)
  • shouldTrackPrediction accepts regimeShardHitRate + regimeShardSamples + regimeThrottleRoll. Throttle fires when
    samples ≥ 20 AND hitRate < 0.50 AND uniform random roll < 0.5 →
    50% emission cut in known-bad regime.
  • Engine maps currentRegime ∈ {bull, bear, chop} →
    CalibrationRegime ∈ {bull, bear, consolidation}, queries the
    shard's hit rate, and passes Math.random() per call so the
    throttle is probabilistic (not a hard veto). Tests inject a
    deterministic roll for repeatability.

Gap 6 — bootstrap calibrator from 30-day time window

Round 1's α bootstrap pulled the most recent 1000 resolved rows.
The count-based bound was wrong-shaped for low-volume operators
(may produce <1000 rows in 30+ days) and high-volume operators
(1000 rows may span only 8h and miss regime variety). Replaced
with a time-windowed pull: 30-day window by default
(VIZZOR_BOOTSTRAP_WINDOW_DAYS) capped at 5000 rows
(VIZZOR_BOOTSTRAP_MAX_ROWS). Zero behavior change for mid-volume
operators with <5000 rows in 30 days.

Round 2.5 commits

  • 8183d89 — feat(config): raise per-family emission floors (Gap 1)
  • d5c662b — feat(ml): bootstrap calibrator from 30-day time window (Gap 6)
  • 93efddd — feat(ml): adversarial signal-conflict detector ι (Gap 2)
  • 7bff8d5 — feat(ml): composite EV gate + regime-aware emission throttle (Gap 3 + Gap 4)

Verification

  • pnpm typecheck — 0 errors
  • pnpm lint — 0 errors (345 pre-existing non-null-assertion warnings)
  • pnpm test — 133 files, 1630 tests, all green

Deferred to Round 3

Gap 5 (bring up the Python ML sidecar) is environmental, not a
code change — handled outside the PR. The combined expected WR lift
from Gaps 1–4 + Gap 6 once 20+ samples accumulate is roughly +10pp
to +15pp on the tracked-decisive cohort, primarily by reducing
low-edge emission volume rather than improving any single signal.


Round 2.6 + Round 3.0 — 2026-05-25 (additional commits)

After Round 2.5 shipped the operator's WR snapshot (/precisions) showed
tracked decisive WR at 28% on n=71 — the qualification gate was selecting
WORSE predictions than the broader 32% cohort, because every Round 2.5
gate was directional-only and RANGE plays at 50% probability flooded
through. Round 2.6 closes the RANGE gap. Round 3.0 adds four free-tier
leading-indicator signals worth an estimated +7–12pp combined WR lift.
A small sweep at the end relocates two hardcoded symbol lists to config.

Round 2.6 — RANGE quality bar

The 45m BTC HIT card surfaced the bug: RANGE 50% · 0/3 systems · dead session · Tier: Tracked. A random-walk play counted toward WR. Round 2.6
adds three sideways-specific gates inside shouldTrackPrediction:

  • 2.6.1a RANGE probability floor 0.58 — a 50% RANGE call carries no
    information; only emit when the model has at least mild conviction.
  • 2.6.1b RANGE signal-confirmation floor 2 — need ≥ 2/6 supporting
    signals; a single signal is noise.
  • 2.6.2 Min-band-width vs ATR gate — skip when band half-width
    < 0.5× ATR(14). Bands narrower than half a 1-bar range have ~50/50
    "actual inside" by random walk; no edge.

Expected: ~50% drop in RANGE emission volume, surviving RANGE plays
carry real information, WR denominator stops being inflated by random
HITs.

Round 3.0 — four new free-tier signals (+10 FOL rules)

# Signal Source New FOL rules
H Funding-rate 30d historical z-score New SQLite table funding_history fed by every cross-venue-funding fetch; computeFundingZScore exposes z-score per venue (Binance / Bybit / OKX, 30-day window, lazy-pruned 35d retention) funding_zscore_extreme_long_bearish (z > +2 → SHORT), funding_zscore_extreme_short_bullish (z < -2 → LONG)
D Deribit options term structure + skew Free get_book_summary_by_currency + get_index_price, no auth. ATM IV per tenor (1-7d, 8-30d), term inversion flag, put/call OI ratio, 25-delta-proxy skew options_iv_term_inverted_volatility_imminent, options_put_skew_extreme_bearish_reversal_imminent, options_call_skew_extreme_bullish_top, options_iv_crush_post_event_relief_bullish
X Hyperliquid DEX positioning Free metaAndAssetCtxs, no auth. Per-asset funding rate + open interest. DEX-vs-CEX funding divergence flags smart-money leverage concentration hyperliquid_funding_long_skew_bearish (HL funding >20% above Binance → SHORT), hyperliquid_funding_short_skew_bullish (HL >20% below → LONG)
R Reddit + 4chan /biz/ retail sentiment Reddit r/cryptocurrency + r/cryptomoonshots .json (UA header only); 4chan /biz/catalog.json. Lexicon-based sentiment + mention-spike z-score per ticker, 6h rolling window retail_euphoria_top_warning_bearish (mention spike >3σ + sentiment > 0.7 → SHORT), retail_capitulation_bottom_warning_bullish (mirror)

All four sources are free, no Nansen / Arkham / Glassnode-paid per
feedback_whale_flow_edge.md. Each gracefully returns null/undefined
for assets the source doesn't cover, and the dependent FOL rules skip
cleanly via typeof guards — no false fires, no degradation for symbols
outside coverage.

Sweep — hardcoded symbol lists → config

Two source-level hardcoded coin lists relocated to config-driven
defaults (per feedback_no_hardcoded_symbols.md):

  • MAJOR_SYMBOLS (collector.ts, 23 tokens) → collector.symbols
  • QUICK_SYMBOLS (telegram/ui-helpers.ts, 7 tokens) → telegram.quickSymbols

getCollectorDefaults() and getQuickSymbols() read the values via
getConfig() at call time with safe pre-config-load fallbacks so an
operator can edit YAML + reload without restart. Runtime Telegram
commands (/collector-symbols, /quick-symbols) deferred to Round 3.1
to complete the spirit of feedback_no_file_edits_for_operator_config.md.

Round 2.6 + 3.0 commits

  • 9dafb16 fix(ml): range quality bar + min-band-width gate (Round 2.6)
  • 8667f23 feat(data): funding-rate 30-day history + z-score (Round 3.0 H)
  • ae7d615 feat(data): deribit options term structure + skew (Round 3.0 D)
  • 3229cad feat(data): hyperliquid DEX positioning + funding divergence (Round 3.0 X)
  • 1ac2fda feat(data): reddit + 4chan retail sentiment NLP (Round 3.0 R)
  • 981f4f9 refactor(config): move MAJOR_SYMBOLS + QUICK_SYMBOLS to config-driven defaults

Verification

  • pnpm typecheck — 0 errors
  • pnpm lint — 0 errors (352 pre-existing warnings)
  • pnpm test — 133 files, 1630 tests, all green

Round 3.1 + 3.2 — 2026-05-25 (post-restart hygiene)

Operator restart bug: after killing the bot, wiping chronovisor_predictions

  • alert_rules (or only one of them), and restarting, old DMs and old
    auto-armed alerts still fired
    . The Round 1 G3 orphan-row guard was only
    applied at notification ENQUEUE time, and prediction-armed alert rules
    survived unless the operator explicitly ran /reset alerts. Two fixes close
    the gap at every surface.

Round 3.1 — session-scoped DM delivery + orphan re-guard at poller

Problem: DMs sit in pending_notifications between enqueue and poller
drain. The G3 orphan guard fires only at enqueue. When the bot is killed
mid-flight and predictions are wiped before restart, the poller picks up
stale rows on the next session and delivers them referencing rows that no
longer exist.

Layer 1 — session expiry at startup. New
expirePendingDmsOlderThan(beforeMs) in pending-queue.ts marks-delivered
every undelivered row with created_at < beforeMs. The poller captures
Date.now() at boot and runs the expiry once. Anything queued by a previous
session is discarded instead of spuriously delivered. Logs:
Expired N stale pending Telegram DM(s) from prior session(s).

Layer 2 — orphan re-guard at delivery. New metadata TEXT column on
pending_notifications (idempotent ALTER, null for legacy rows).
sendTelegramDm passes notification.metadata through. The poller parses
metadata before each send and re-checks chronovisor_predictions /
alert_rules existence — if the underlying row is gone the DM is marked
delivered without sending. Mid-session deletes are caught even when the
DM was already queued.

Round 3.2 — prune orphaned prediction-armed alert rules at startup

Problem: Predictions arm price-threshold alert rules with id
pred_<predictionId>[_upper|_lower|_label]. Manual SQL wipes that only
delete chronovisor_predictions leave these rules dangling. On the next
session the price-alert-bridge keeps firing them against live prices and
the operator sees 42 system alerts on /alerts for predictions that no
longer exist.

Fix: New pruneOrphanedPredictionAlertRules() in
notifications/store.ts runs once at bot startup:

  1. SELECT id FROM alert_rules WHERE id LIKE 'pred_%'
  2. Strip pred_ prefix and _upper/_lower/_tp1/_sl suffix to extract
    the underlying predictionId
  3. Cross-check chronovisor_predictions; collect IDs whose row is gone
  4. Batch DELETE all orphans in one statement

Manual user alerts (no pred_ prefix) are left untouched. Logs:
Pruned N orphaned prediction-armed alert rule(s).

Round 3.1 + 3.2 commits

  • 9846f56 fix(telegram): session-scoped DM delivery + orphan re-guard at poller (Round 3.1)
  • ba52277 fix(telegram): prune orphaned prediction-armed alert rules at startup (Round 3.2)
  • fdfd14b feat(telegram): refresh /start welcome + first-run nudge with chat-first framing

Verification

pnpm typecheck   # 0 errors
pnpm lint        # 0 errors
pnpm test        # 1630/1630 passing

Manual operator test:

  1. Kill bot, manually DELETE FROM chronovisor_predictions, restart
  2. /alerts after startup → should show 0 system alerts (was 42)
  3. Bot log shows Pruned N orphaned prediction-armed alert rule(s) and
    Expired N stale pending Telegram DM(s) lines
  4. Run a fresh /predict BTC 1h, kill bot mid-resolution, wipe predictions,
    restart → resolved-DM does NOT fire (orphan guard at delivery suppresses)

Round 3.3 — 2026-05-25 (production hardening: Sprint A + B)

After the production-deployment audit flagged 5 BLOCKERs + 12 HIGHs,
this sprint clears all 9 critical items so the branch is shippable.
Post-launch items (coverage thresholds, runbook, MEDIUM/LOW) are
intentionally deferred until the bot is in production where the
operator can observe and prioritize them with real data.

Sprint A — Pre-launch blockers

A1 — env-var validation fail-fast (config/loader.ts)
New validateProductionConfig() triggers when NODE_ENV=production
OR VIZZOR_REQUIRE_FULL_CONFIG=true. Throws at boot when
ANTHROPIC_API_KEY, TELEGRAM_BOT_TOKEN, or ML_API_SECRET
(when ml.enabled) is missing. Dev runs without keys still work.

A2 — Sentry error tracking (utils/sentry.ts)
New @sentry/node integration gated on SENTRY_DSN. Captures
unhandledRejection + uncaughtException before the existing stderr
log. PII scrub redacts Authorization-style headers. Graceful
flush on SIGTERM. No-op when DSN unset.

A3 — security patches (pnpm up)
Cleared 6 moderate + 1 high advisories (postcss XSS, brace-expansion
DoS, ws uninitialized-memory disclosure, ...). Remaining 2 moderate

  • 1 high sit in deep transitive paths via @solana/web3.js — at the
    floor available without dropping wallet-chain support.

A4 — hot-path DB indices
Added composite + partial indices on chronovisor_predictions
(symbol+created_at, resolved_at, user_id+symbol, source+resolved_at),
alert_rules (enabled+type), and a standalone funding_history
(fetched_at) so the lazy-prune query becomes a range scan instead
of a table scan once the table grows past ~100k rows.

A5 — /health auth (telegram/health-server.ts)
Optional bearer-token gate via VIZZOR_HEALTH_TOKEN. Constant-time
comparison. Returns 401 with WWW-Authenticate challenge. Open
endpoint preserved when env var unset (back-compat with loopback-
only setups).

Sprint B — Pre-launch reliability

B1 — Prometheus metrics endpoint (utils/metrics.ts)
New /metrics endpoint gated on VIZZOR_METRICS_PORT. Exposes
default process metrics + four counters
(vizzor_predictions_emitted_total, vizzor_predictions_resolved_total,
vizzor_telegram_dm_delivered_total,
vizzor_data_source_fetch_total) + an
vizzor_engine_cycle_duration_seconds histogram. Recorders are
small wrappers so call sites never import prom-client directly.

B2 — retry / backoff / circuit breaker (utils/fetch-resilience.ts)
New resilientJsonFetch(url, source, opts) adds 2 exponential-
backoff retries on 5xx + timeouts, per-source circuit breaker (5
consecutive failures → 60s open with half-open probe on recovery),
and metrics on every call. Applied to Deribit options, Hyperliquid
positioning, and Reddit + 4chan retail-sentiment adapters. Remaining
stable upstreams (Bybit, OKX, Etherscan, mempool.space) deferred
to a follow-up — changing them in the same commit increased blast
radius without matching reward.

B3 — image version pins (docker-compose.yml)
postgres: timescale/timescaledb:latest-pg15 → 2.21.4-pg15,
n8n: latest → 1.117.4. Bumps now require staging validation +
backup; no more silent breaking changes via floating tags.

B4 — n8n healthcheck (docker-compose.yml)
Probe against n8n's built-in /healthz. Container no longer
reports "up" when the internal process is crashed.

Round 3.3 commits

  • bc0ba42 feat(config): fail-fast on missing production env vars (A1)
  • 28a3c0f feat(ml): sentry error tracking with graceful flush on shutdown (A2)
  • d90502e chore(deps): patch security advisories via pnpm up (A3)
  • 5313683 perf(data): hot-path indices on predictions / alert_rules / funding_history (A4)
  • 9de0ae0 feat(telegram): optional bearer-token auth on /health endpoint (A5)
  • 7a60c22 feat(ml): prometheus metrics endpoint with engine + delivery counters (B1)
  • 1222052 feat(data): retry / backoff / circuit-breaker on Round 3.0 data sources (B2)
  • 3d4f701 chore(deps): pin postgres + n8n image versions, add n8n healthcheck (B3 + B4)

Production env vars introduced

Var Required Effect when set
NODE_ENV=production yes Activates env-var fail-fast (A1)
VIZZOR_REQUIRE_FULL_CONFIG=true no Opt-in to A1 outside prod (staging)
SENTRY_DSN recommended Activates Sentry error capture (A2)
SENTRY_TRACES_SAMPLE_RATE no Defaults to 0 (no tracing)
VIZZOR_HEALTH_TOKEN recommended Bearer-auth on /health (A5)
VIZZOR_METRICS_PORT recommended Activates /metrics server (B1)
VIZZOR_METRICS_HOST no Defaults to 127.0.0.1

Verification

  • pnpm typecheck — 0 errors
  • pnpm lint — 0 errors
  • pnpm test — 133 files, 1630 tests, all green
  • pnpm audit --prod — 2 moderate + 1 high (1 ignored), all deep
    transitive via @solana/web3.js (upstream tracking issue)

Deferred to post-launch (operator observes + prioritizes)

  • Recording at call sites for the metrics counters defined in B1
    (engine.predict, prediction-resolver, pending-poller, data
    sources) — infrastructure ships first, then per-module recorders
  • Cross-venue funding + ETH gas + BTC mempool migration to
    resilientJsonFetch (stable upstreams; lower priority)
  • CI coverage threshold enforcement
  • Backup / restore runbook in DEPLOYMENT.md
  • Rate-limit map memory bounds
  • Runtime Telegram commands for collector/quickSymbols config
  • /reset notifications and /reset config commands

imzzaidd added 30 commits May 21, 2026 15:10
Brings PR #34-#43 hotfixes into develop so feat branches can build
on top of the band-gating, range-resolution, and SQLite-mount fixes
that shipped via direct-to-main hotfixes.
normalized table holding one row per (resolved prediction, contributing
signal). enables fast per-signal decisive WR queries without parsing
signal_snapshot JSON on every read. foundation for the polaris signal
auditor and polarity registry — a signal whose hit rate sits chronically
below 0.40 over 30+ samples is a sign-inversion candidate.
polaris diagnostics layer. enables per-signal WR-driven sign inversion
and dampening without retraining.

- signal-auditor: classifies each signal as INVERT/DAMPEN/WATCH/KEEP
  based on rolling decisive WR (threshold <0.40 over 30+ samples →
  inversion candidate).
- signal-polarity: runtime registry consulted by engine before composite
  assembly; supports auto (auditor-driven) and manual (operator-pinned)
  overrides. Persisted to signal_polarity sqlite table.
- realized-priors: empirical bayesian prior derived from the recent
  realized-direction distribution per family. falls back to static
  familyAwarePrior when the sample is too small to trust.
- /wr-diagnose telegram command: per-signal WR breakdown with inversion
  / dampen / watch sections + active overrides.
- /signal-override telegram command: operator-pinned polarity wins over
  auto auditor.
- pnpm wr:audit cli: signal-level WR audit; exits non-zero when
  inversion candidates exist so CI can flag.
polaris predictiveness layer — engine now anticipates moves via leading
indicators instead of reacting to lagging structure.

- leadingFlow signal aggregator: taker momentum, orderbook imbalance,
  whale flow, CVD slope, funding inflection. injected with 30% weight
  at the FRONT of the onchain composite via new sigCfg.onChain.
  leadingFlowWeight.
- cvd-gated SMC vote: when SMC reads bullish demand but cvd's recent
  slope is bearish (demand is being eaten), neutralize the structural
  vote instead of letting it outvote real-time order-flow truth.
- 8 reversal FOL rules: failed_support_break, failed_resistance_rejection,
  asian_range_continuation (bull/bear), funding_peak_reversal_bearish,
  funding_trough_reversal_bullish, sweep_and_reverse_bullish_trap,
  sweep_and_reverse_bearish_trap.
- 4 structural FOL rules: stable_supply_expanding_bullish + contracting_
  bearish, alt_rotation_confirmed_alt_bullish, btc_rotation_dominant_btc_
  bullish (TOTAL2/TOTAL3 dominance algebra).
- 2 volatility FOL rules: vol_term_inverted_break_imminent,
  vol_acceleration_post_consolidation (realized vol + term structure).
- realized-volatility signal: short (12h) vs long (48h) realized σ +
  acceleration, pure compute on existing OHLCV.
- MarketContext gains spotPrice, stablecoinNetExpansionUsd, altDominancePct,
  nonBtcEthDominancePct, realizedVol1h/4h, volTermSlope, volAcceleration.
Polaris D + F. Post-processes the predicted forecast range so the
band edges absorb two predictable wick patterns the prior grader
counted as MISS:

- applyLiquidationMagnet: when a significant long/short liquidation
  cluster sits beyond the band edge on the matching side, stretch the
  edge to (cluster ± small buffer) capped at 2x the original half-
  width. Symbol-agnostic — works off liquidationProximity data the
  engine already gathers, no hardcoded ticker list.
- applyMomentumBandWidth: when taker momentum + orderbook imbalance
  derivative + CVD short slope stack one way with magnitude >= 0.25,
  widen the band on that side by up to 1.8x while modestly widening
  the other edge. Captures the BTC 30m MISS pattern where bearish
  momentum stacked but the band was sized for a quiet hold.

Both helpers run AFTER computePredictionForecast closes, on a shallow
copy of the forecast, so callers can chain them without ownership
concerns. Pure pricing math — no scoring side effects.

Includes the placement fix in engine.ts: the two import-and-apply
blocks live AFTER the forecast finalization, not inside the composite
pipeline.
Operator bug report 2026-05-21: the per-user filters in
/precisions, /wr, and /predictions were collapsing to "show every
row across all users". The columns existed and the read paths
filtered correctly — but the write path never populated user_id,
so every row was logged as NULL and the OR-clause widened the
result to everything in the database.

This commit closes the loop end-to-end:

- accuracy-tracker.logPrediction: INSERT now includes user_id and
  reads PredictionRecord.userId. Migration adds the column for
  fresh DBs / unit tests that construct AccuracyTracker ahead of
  the multi-user migration run.
- PredictionRecord gains an optional userId field with full
  rationale in the doc comment.
- engine.predict() gains an options.userId that flows through to
  every logPrediction call site (forecast, directional with TP/SL,
  forecast resolve).
- All Telegram and AI tool-call sites that originate from an
  authenticated user thread userId through:
  - /predict, /diagnose, /polymarket commands
  - get_prediction, predict_price, polymarket_edge tools
  - bulk-ladder predict calls
- New regression test verifies that a per-user predict call writes
  the user_id column.

Scheduler broadcasts and CLI/TUI invocations leave userId
undefined, so the row is NULL and the OR-clause keeps shared
broadcasts visible to every watchlist-matching subscriber.
Operator-requested trader-channel format. One compact card per
horizon, MarkdownV2-escaped, with branded glyphs for known assets
and a clean fallback for everything else.

Cards in two shapes:

  ₿ BITCOIN · 30m
  💰 BTC Price: $77,733.83
  💵 Direction: ➖ RANGE (55%)
  🪙 Band: $77,656.09 — $77,811.56
  💹 Best Play: Range fade — long $77,656.09, short $77,811.56 (no leverage)

  🟣 SOLANA · 30m
  💰 SOL Price: $87.40
  💵 Direction: 📉 SHORT (61%)
  🪙 Entry Zone: $87.40 — $87.44
  📈 TP1: $87.29 (-0.13%)
  📊 SL: $87.54 (+0.16%)
  ⚠ Skip: R:R 1:0.78 — risk exceeds reward, no trade

Multi-horizon replies separate cards with a horizontal rule and
surface a "Multi-horizon consensus" chip when per-horizon directions
disagree, so the operator sees incoherent calls before placing a
trade.

Files:
- src/telegram/render-predict-card.ts: pure renderer
- src/telegram/symbol-display.ts: glyph + name registry. Built-in
  table for the most common assets, with operator extension via
  chronovisor.symbolDisplay.<TICKER> in config. Unknown symbols
  fall back to a generic icon + raw ticker — features still apply
  universally (no symbol-list gating).
- Config schema gains the optional symbolDisplay record.
- /predict and the predict-keyboard callback both swap from
  renderChronoVisorResult to renderPredictCard.
Exposes the signal auditor as `pnpm wr:audit` so it can run from CI
or local one-shot diagnostics. The script (scripts/wr-audit.ts)
landed in the earlier polaris diagnostics commit; this entry just
makes it discoverable through the package.json scripts surface.

The audit exits non-zero when any active signal sits below the
INVERT threshold (decisive WR < 0.40 over 30+ samples), so CI can
gate releases on per-signal calibration health.

The companion /wr-diagnose and /signal-override Telegram commands
were registered in the polarity-registry commit; no further wiring
is needed here.
…dm guard

Operator bug report 2026-05-22: /reset all and manual sqlite DELETE
runs left stale state behind that re-emerged as ghost predictions
in the next cycle. Three causes, fixed in one atomic operation:

1) Polaris-era tables survived the wipe. signal_outcomes,
   signal_polarity, calibration_family, calibration_symbol,
   rule_stats, volatility, weights, and patterns all needed
   explicit DELETE statements that the legacy resetPredictionHistory
   path didn't know about.

2) In-memory registries kept their snapshots. CalibratorRegistry,
   RuleTracker, VolatilityRegistry, SignalPolarityRegistry, the
   pattern library, and warmup-state's cached counts all held a
   live view of the pre-wipe state — which the next debounced
   ml-state save then wrote BACK to DB, undoing the wipe.

3) In-flight resolved-DMs from prior pending_notifications kept
   landing after the wipe completed, making it look like
   predictions had resurrected.

This commit closes all three:

- New full-reset module wraps the wipe atomically: every prediction-
  related table, every alert rule, every pending notification.
- ChronoVisorEngine.getResetCallbacks() exposes named clear()
  functions for every in-process registry the engine owns.
  WarmupState.reset() + RuleTracker.clear() were added so the
  rule-stat / cached-count surfaces can be flushed.
- Telegram outbound channel adds an orphan-row guard: before
  delivering a prediction_resolved or price_threshold DM it checks
  that the underlying row still exists. If not, the DM is
  suppressed with a debug log — eliminating the post-wipe ghost
  bubble. The resolver writes predictionId into metadata so the
  guard has a key to look up.
- /reset all reply gains a visible "VIZZOR STATS RESET" divider
  block so the chat scrollback has an unambiguous "everything
  above is stale" marker (Telegram permanently keeps history).
Polaris α + β + γ. Three independent accelerators that reduce the
real-world sample budget needed to exit the warmup-conservatism
regime and engage full-strength prediction.

α — Calibrator bootstrap from history

  loadMlState() now replays up to 1000 most-recent resolved
  predictions through the calibrator + rule tracker on boot when
  the persistence layer comes back empty. A fresh DB or post-/reset
  start that already has historical resolutions in
  chronovisor_predictions exits warmup in seconds instead of waiting
  30+ live resolutions. Idempotent: a non-empty
  chronovisor_calibration_family table short-circuits the bootstrap.

β — Graduated warmup multiplier

  Original behaviour was binary: below minSamples the cap was applied
  at full strength; at minSamples it dropped entirely. This produced
  an "11-to-12 sample cliff" where one lucky resolution unlocked
  full-strength confidence. The graduated relaxation now scales the
  cap with sqrt(samples / minSamples):

    0  samples → 0.0  (full cap applied)
    3  samples → 0.5  (50% applied)
    12 samples → 1.0  (cap removed)

  applyProbabilityCap and applyConfidenceCap both consume the
  multiplier so probability and confidence climb smoothly.

γ — Graduated outcome labels for Platt scaling

  ConfidenceCalibrator.addOutcome now accepts a number ∈ [0,1] in
  addition to the boolean wasCorrect. A graduated resolution score
  (0.5 = neutral, 0.65 = near-miss, 1.0 = full hit) feeds the SGD
  update directly, nudging the curve proportionally instead of as
  binary 0/1 corrections. Empirically converges the Platt curve in
  roughly half the sample count for the same MAE. The histogram
  bin still uses the 0/1 convention (score >= 0.5 counts as a hit)
  so calibration plots remain interpretable.

Per-family configuration

  Production data (May 2026) showed scalp-15m WR at 31.9% and
  intraday-1h WR at 14.5%. Per-family emission floors are now
  configurable via chronovisor.perFamilyMinLogConfidence
  (micro=0.45, intraday=0.40, swing=0.35, macro=0.30) and the
  engine reads through getMinLogConfidenceForFamily so the floor
  is applied at qualification time.

  Sideways dead zone widened from ±0.005 to ±0.04 around 0.5 — the
  razor-thin band was emitting directional calls at probabilities
  barely distinguishable from a coin flip. Widening to 0.46/0.54
  forces uncertain calls to file as `sideways` (RANGE in the
  renderer) which is what they always were.

  New wrWatchdog config drives a rolling-WR warning notification
  in the resolver: when the recent decisive WR falls below
  warnBelow over windowSize samples, an info DM names the worst-
  performing signal so the operator can pause or override it.

Polarity reconciliation cadence

  The existing CALIBRATION_RETRAIN_INTERVAL now also triggers
  signal-polarity reconciliation from the auditor, so a freshly-
  seen sign error is corrected within ~20 resolutions of becoming
  statistically defensible. Manual overrides remain untouched.

Per-prediction signal_outcomes

  Resolutions write per-signal vote/outcome rows to the
  signal_outcomes table the diagnostics layer landed earlier,
  feeding the auditor, /wr-diagnose, and the polarity registry.
  Best-effort: a write failure must never block the resolution
  path.
…NGE gate label

Operator MISS data 2026-05-21/22: every BTC directional and RANGE
MISS in the screenshots had Gate=0/3 or 1/3 with cycleHealth in
either `degraded` or `unsafe`. The qualification gate downgraded
these to advisory tier but the resolver still surfaced them in chat
as "Tier: Tracked" via the resolved-DM path, so they still polluted
the visible WR.

shouldTrackPrediction now hard-skips directional predictions when
cycleHealth.status is `degraded` or `unsafe` AND systemConfirmations
is below 3, returning `track: false` with state `skip` so they
never enter the resolution pipeline at all. The eligibility check
is gated on `!warmingUp` so cold-start predictions still flow
through the relaxed-floor warmup path.

Companion fix in render-crypto-signal: the "🚦 Gate: 0/3 — SKIP"
label was misleading for RANGE plays. systemConfirmations is 0 by
definition on a sideways forecast (no direction to confirm), so
the operator was reading SKIP on a tracked-tier RANGE play and
assuming the engine had ignored the gate. RANGE plays are graded
on the band, not on system votes, so the renderer now emits
"🚦 Gate: RANGE play (3-system gate applies to directional only)".
Polaris patch release. Bundles the per-user WR fix, the /reset
ghost-DM elimination, the /predict card renderer, graduated warmup
acceleration, cycle-degraded hard-skip, and the band-deformation
helpers into a single patch release on top of 0.15.0.

- root package: 0.15.0 → 0.15.1
- web/package: 0.15.0 → 0.15.1
- web/lib/constants APP_VERSION fallback: 0.15.0 → 0.15.1
- test/unit/utils/package.test.ts now reads the canonical version
  from package.json instead of hard-coding it, so the assertion
  no longer goes stale on every patch bump.
ω₁ — dead ICT session + no active flow + thin confirmations now routes
to state='skip' instead of advisory. The 2026-05-22/23 BTC RANGE MISSes
all fired during dead session with 1/3 systems agreeing.

ω₃ — universal R:R<1.0 hard-skip across all tracking profiles. Balanced
mode previously tolerated negative-EV trades as 'Tentative'; operator
MISS data (SOL 45m SHORT R:R 1:0.67 → +2.04% opposite) confirms these
should never reach chat.

Updates the balanced-mode edge_too_thin test to match the new policy:
R:R 0.3 now asserts track=false / state='skip' / reason starts with
rr_below_one_negative_ev.
ω₄ — /precisions and /wr render "— (n/m, warming up)" until a
calibrator has at least 10 decisive resolutions. Stops the bot from
flashing 14.5% / 22% WR figures that are statistical noise from a
handful of early samples.

ω₆ — /predict card now includes a "🔬 Why" section per emission with:
  • per-signal CF contributions (which signals fired, which agree)
  • triple-system status (Vizzor TA + SMC + ICT bias / kill-zone)
  • guardrails snapshot (gate result, R:R, ATR sanity)
  • short reasoning paragraph
renderPredictCard now takes an includeWhy option, on by default in
the /predict command path.
ω₅ — MomentumBandInputs now accepts volTermSlope + volAcceleration.
composeMomentumDirection applies a volMagnitudeLift derived from
realized-vol features so the band widens when 1h realized σ is rising
into post-consolidation breakouts. Direction is untouched — only band
magnitude reacts (vol is a magnitude indicator, not a directional one).

π — CVD signal now fetches 5m and 1h klines in parallel and exposes
hourlySlope + multiTfDivergence. A 5m-bullish / 1h-bearish split is
classified as bearish_multi_tf (retail pump vs smart-money
distribution) and the mirror as bullish_multi_tf. Drives the new
multi_tf_cvd_divergence_* FOL rules.
…δ θ)

ε — Platt calibrator now keeps per-(family, regime) shards in addition
to the existing per-family and per-(symbol, family) shards. Regimes
(bull / bear / consolidation) are routed by the prediction-resolver
from the signal snapshot. Each resolution updates all four buckets
(family, family+regime, symbol+family, symbol+family+regime) so a
bull-trained curve no longer poisons bear emissions.

δ — Fresh per-symbol shards now inherit the family's Platt parameters
via inheritFamilyPriorIntoShard() instead of starting cold. New tokens
skip the multi-day warmup that previously emitted uncalibrated
probabilities for ~48 h after first prediction.

θ — addOutcome takes an optional confidenceWeight and updatePlatt
scales the SGD step by an lrScale derived from |p − 0.5| × 2 + 0.5.
A 75%-confidence wrong call now corrects the calibrator harder than a
55% one — high-conviction misses dominate learning, low-conviction
ones add minimal noise.
H — cross-venue funding aggregator combines Binance, Bybit, and OKX
funding rates (all free public endpoints, 30 s TTL cache). Returns
weighted average plus single-venue divergence (binanceDivergencePct,
maxDivergencePct). Drives funding_divergence_long_skew_bearish /
short_skew_bullish / funding_term_extreme_dispersion_advisory.

K — ETH gas spike detector via Etherscan free tier. Rolls a 5 min
baseline of fast gas, flags isSpike when current ≥ 3× baseline. A
3× spike inside 5 min historically precedes ETH liquidation cascades
by 30–90 min. Drives eth_gas_spike_liquidation_imminent_bearish.

L — BTC mempool fee pulse via mempool.space free API. Rolls a 5 min
baseline of fastestFee, flags isCongested when current ≥ 2× baseline.
Fee acceleration is a 30–90 min leading volatility signal. Drives
btc_mempool_fee_congestion_volatility_imminent_bearish.

All three respect the no-paid-vendor rule (no Nansen / Arkham) and
share a 60 s in-process snapshot cache so concurrent symbol predictions
each cycle issue at most one network call per source.
…+ N + H I J K L M N π σ τ)

N — new volume-profile signal computes a 50-bucket profile over recent
klines, locates the Point of Control (POC), value-area-high/low (70%
volume band), and an insideValueArea flag. Pure compute on existing
Binance klines. Drives volume_profile_above_va_mean_revert_bearish /
below_va_mean_revert_bullish.

ω₂ — engine now neutralizes the SMC vote when smc.vote disagrees with
both smcLastBreakType and smcBias. The 2026-05-22 BTC screenshots all
showed SMC marked bearish · in supply zone while the directional vote
was still bullish — a self-contradiction the engine shouldn't propagate.

MarketContext + FOL rules expanded with the full Round 2 surface:
  • stablecoinNetExpansionUsd → stable_supply_expanding_bullish /
    contracting_bearish (I)
  • altDominancePct / nonBtcEthDominancePct → alt_rotation_confirmed_alt_bullish /
    btc_rotation_dominant_btc_bullish (J)
  • realizedVol* → vol_term_inverted_break_imminent /
    vol_acceleration_post_consolidation (M)
  • crossVenueFunding* / fundingVenueCount → funding_divergence_* (H)
  • ethGas* → eth_gas_spike_liquidation_imminent_bearish (K)
  • btcMempool* → btc_mempool_fee_congestion_volatility_imminent_bearish (L)
  • volumeProfile* → volume_profile_above/below_va_* (N)
  • cvdHourlySlope / cvdMultiTfDivergence → multi_tf_cvd_divergence_*
    bullish_smart_money_accumulation / bearish_smart_money_distribution (π)
  • isInFundingPaymentWindow helper → funding_payment_imminent_long_squeeze_bearish /
    short_squeeze_bullish / just_paid_relief_bullish / capitulation_bearish (σ)
  • ictKillZone session momentum → session_momentum_continuation_bullish /
    bearish (τ)

Engine fetches cross-venue funding / ETH gas / BTC mempool / volume
profile snapshots in the prediction cycle and threads them through
buildMarketContext enrichment. Also passes ictKillZone + hasActiveFlow
into shouldTrackPrediction so ω₁ has the inputs it needs to fire.
Polaris v0.15.1 Round 2.5 — perFamilyMinLogConfidence floor bump.
Production data after Round 2 ship (May 2026, 31 tracked predictions):
tracked decisive WR ~32% with emissions at probability 55–58%. A 55%
prediction cannot resolve at 70% even with perfect calibration —
the math forbids it.

Floors raised: micro 0.45→0.58, intraday 0.40→0.55, swing 0.35→0.52,
macro 0.30→0.50. Volume drops ~40%, surviving emissions resolve
with meaningfully better expected value. Lower back to Round 2 values
once decisive samples per family cross 60 and calibration is trusted
across all 3 regime shards.

Schema default in schema.ts updated to match for fresh-config users.
The Round 1 α bootstrap replayed the most recent 1000 resolved rows
into the calibrator on boot. That count-based bound was wrong-shaped
for two failure modes:
  - Low-volume operator: 6 weeks between deploys may produce <1000
    resolutions and the bootstrap pulled them all, but a fresh-DB
    operator (post `/reset all`) had nothing at all to seed from
  - High-volume operator: 1000 rows might span only 8 hours and
    miss long-tail regime patterns

Replace the LIMIT 1000 clause with a time-windowed pull (default 30
days, configurable via VIZZOR_BOOTSTRAP_WINDOW_DAYS) capped at 5000
rows (VIZZOR_BOOTSTRAP_MAX_ROWS) as a safety bound. The time window
guarantees regime coverage; the row cap prevents thrash on a busy
calibrator. Both env vars override at boot.

Zero behavior change for the common case (mid-volume operator with
<5000 rows in 30 days) — same rows replayed, just selected by
recency window instead of fixed count.
CF algebra combines the 6 signal sources into a single composite score
but loses information about WHICH signals disagreed and by HOW MUCH.
A 0.7 bullish onChain combined with a -0.7 bearish logicRules collapses
to a near-zero composite — the engine then emits as low-conviction
sideways and the calibrator trains on a directionless sample that
carries no learning signal.

Operator MISS pattern 2026-05-22/23: SMC bearish + ICT bullish + RANGE
call — actual move landed outside the band. Two strong signals in
opposite directions don't combine to a tradeable RANGE; they combine
to "we don't know, skip."

Detector scans high-weight non-correlated signal pairs (currently just
onChain × logicRules) and flags conflict when both |CF| >= 0.6 AND
their signs oppose. Engine hook fires before the existing
systemConfirmations/targetEqualsEntry normalize-to-sideways block:
on conflict it forces direction='sideways', collapses probability
toward 0.5–0.6, recomputes forecast as a sideways play, and threads a
`signal_conflict` reasoning line into the trigger snapshot so /predict
🔬 Why and /diagnose surface the downgrade cause.

False-conflict cost: 0 (no skip — the prediction still emits, just as
RANGE). True-conflict gain: removes ~5–10% of the misfile-as-tracked
volume that was poisoning the calibrator.
….5 Gap 3 + Gap 4)

Gap 3 — composite EV gate.
  ω₃ catches R:R < 1.0 and the per-family probability floor catches
  low-confidence calls, but a 56% / R:R 1.05 combo still escapes both
  axes individually while having near-zero expected edge. Approximate
  EV per unit risk as (probability − 0.5) × edgeRatio. Hard-skip when
  composite EV < 0.04 for directional predictions with R:R >= 1.0
  (R:R < 1.0 already caught by ω₃, no double-skip). Knocks out the
  marginal positive-RR but low-edge tail that was hitting the
  calibrator with coin-flip outcomes.

Gap 4 — regime-aware emission throttle.
  The Round 2 ε regime-bucketed calibration LEARNS per regime but does
  not THROTTLE EMISSIONS in a known-bad regime. If the bull-trained
  shard sits at 65% WR and the bear-trained shard at 30% WR, the bot
  should emit fewer predictions in bear, not the same volume.
  - New ConfidenceCalibrator.getHitRate() → cumulative bin-aggregate WR
  - New CalibratorRegistry.getRegimeShardHitRate(family, regime)
  - prediction-qualification.ts gains regimeShardHitRate +
    regimeShardSamples + regimeThrottleRoll inputs; throttle fires
    when shard has >= 20 samples AND hitRate < 0.50 AND uniform random
    roll < 0.5 → 50% emission cut in known-bad regime
  - engine.ts maps currentRegime ∈ {bull, bear, chop} →
    CalibrationRegime ∈ {bull, bear, consolidation}, queries the
    shard's hit rate, and passes a fresh Math.random() per call so
    the throttle is probabilistic (not a hard veto). Tests can
    inject a deterministic roll for repeatability.
Round 2.5 directional gates (per-family floor, composite EV, regime
throttle, dead-session, R:R<1) all return early when direction is
sideways. Range plays therefore flooded through with zero quality
filtering — operator MISS data 2026-05-24 showed a 45m BTC tracked
HIT at `range 50% · 0/3 systems · dead session`, a random-walk play
counted toward the tracked WR denominator.

Adds three sideways-specific floors inside shouldTrackPrediction:

  2.6.1a — probability floor 0.58
    A 50% range call carries no information beyond "we don't know".
    Only emit when the model has at least mild conviction the band
    will hold.

  2.6.1b — signal confirmation floor 2
    Range plays need at least 2/6 supporting signals (typically: low
    vol + mean-revert structure). One signal alone is noise.

  2.6.2 — band-width vs ATR gate
    Skip when the band half-width is less than 0.5× ATR(14). A band
    narrower than half a typical 1-bar range gives the "actual inside"
    outcome ~50/50 odds by random walk — the prediction has no edge.
    Engine computes (rangeHigh − rangeLow) / 2 / atr14 and threads it
    through; undefined when forecast or ATR is missing so the gate
    fails open on degraded input.

Expected impact: ~50% drop in range emission volume, surviving range
plays carry real information (not coin flips), tracked WR denominator
stops being inflated by random-walk HITs.

Note on the original Round 2.6 plan: item 2.6.3 (AccuracyTracker.clear
wired into performFullReset) is moot — AccuracyTracker is 100% DB-
backed with no in-memory cache, so resetPredictionHistory() inside
performFullReset() already zeros it. Dropped from this commit.
The Round 2 cross-venue funding rules ask "is current funding hotter
on Binance than on other venues right now?". The z-score rules ask a
deeper question: "is current funding hotter than this venue's own
30-day distribution?". A z-score > 2 means current funding is in the
97.5th percentile of recent history — leverage is genuinely stretched,
not just nominally positive.

New module src/data/sources/derivatives/funding-history.ts:
  - SQLite table funding_history (venue, symbol, rate, fetched_at)
    + composite index for the lookback query
  - recordFundingHistory(venue, symbol, rate) appends on every
    successful cross-venue fetch (cache-miss path only — cache hits
    would duplicate observations and inflate the sample count)
  - computeFundingZScore(venue, symbol, lookbackDays=30) returns
    { zScore, mean, std, samples, reliable } with lazy pruning of
    rows older than 35d retention
  - computeAllVenueFundingZScores aggregates across Binance + Bybit
    + OKX and exposes maxAbsZScore for extremity flagging

cross-venue-funding.ts extended:
  - Bybit fetch path appends history row on cache-miss
  - OKX fetch path appends history row on cache-miss (instId
    normalized back to BASEUSDT for storage consistency)
  - Binance rate (fetched upstream by engine) recorded inside
    fetchCrossVenueFunding on each aggregation call

MarketContext + FOL rules:
  - fundingZScoreBinance, fundingZScoreMaxAbs, fundingHistorySamples
  - funding_zscore_extreme_long_bearish — fires when binance z > +2,
    samples >= 30, funding > 0 (longs paying extreme premium → SHORT)
  - funding_zscore_extreme_short_bullish — mirror at z < -2

Cost model: ~26k rows/day across 3 venues × 3 symbols × 30s polls,
~780k rows at 30d retention. SQLite handles trivially (<50MB,
<1ms query with the composite index).
Deribit's free public API (no auth) exposes the entire BTC + ETH
options chain in a single request. Per-instrument we get mark IV,
open interest, volume, and the strike/expiry encoded in the
instrument name. From that we compute four institutional-positioning
signals that intraday spot directional models cannot derive from
spot prices alone.

New module src/data/sources/derivatives/deribit-options.ts:
  - parseInstrumentName: BTC-27JUN26-100000-C → side/strike/expiry
  - fetchIndexPrice: independent index price (BTC + ETH supported)
  - fetchBookSummary: full options chain in one call
  - fetchDeribitOptionsSnapshot: computes
      atmIv7d        proximity-weighted ATM IV for 1-7d expiries
      atmIv30d       same shape for 8-30d expiries
      termInversion  true when iv7d > iv30d by >5%
      putCallOiRatio sum(put OI) / sum(call OI)
      otmSkew        IV(OTM puts 5-15%) − IV(OTM calls 5-15%)
                     25-delta skew proxy without Black-Scholes
  - deribitHasOptionsFor: gate helper for currencies Deribit lists
    options on (BTC + ETH currently; data-source limit, not a
    hardcoded symbol allow-list)

MarketContext + 4 FOL rules:
  - optionsAtmIv7d / Iv30d / TermInversion / PutCallOiRatio / OtmSkew
  - options_iv_term_inverted_volatility_imminent (CF 0.25 advisory)
  - options_put_skew_extreme_bearish_reversal_imminent (CF 0.5)
  - options_call_skew_extreme_bullish_top (CF 0.5)
  - options_iv_crush_post_event_relief_bullish (CF 0.35)

Other symbols: ctx fields stay undefined, all four rules skip via
their typeof guards. No degradation, no false fires. 60s cache TTL
sufficient for our prediction cadence.
… Round 3.0 X)

Hyperliquid is the largest fully on-chain perpetual DEX. Its public
metaAndAssetCtxs endpoint exposes funding rate + open interest per
asset with no auth.

Why this matters: sophisticated traders (multi-strat funds, prop
desks) have been migrating size to Hyperliquid for two reasons —
(a) full on-chain transparency for LP reporting, (b) no KYC and no
CEX-counterparty risk. When Hyperliquid OI grows faster than CEX OI,
smart money is positioning ahead of moves the spot crowd hasn't
seen yet. The DEX-vs-CEX funding spread highlights the SIDE that
the smart-money flow is leveraged on — DEX funding hot vs CEX = DEX
longs over-extended → squeeze risk SHORT.

New module src/data/sources/dex/hyperliquid-positions.ts:
  - fetchMetaAndAssetCtxs: POST /info {type:metaAndAssetCtxs}
    returns [universe, ctxs[]] aligned by index
  - fetchHyperliquidPositioning: extracts per-asset funding rate,
    open interest (USD), mark price, and 24h volume; computes
    DEX-vs-CEX funding divergence vs the supplied Binance rate
  - 60s cache TTL keyed by (coin, binanceRate)
  - Symbol-agnostic — consumes whatever Hyperliquid lists; returns
    null cleanly for assets not in HL's universe (no hardcoded
    allow-list)

MarketContext + 2 FOL rules:
  - dexCexFundingDivergencePct, hyperliquidFundingRate,
    hyperliquidOpenInterestUsd
  - hyperliquid_funding_long_skew_bearish — HL funding >20% above
    Binance AND HL funding positive → DEX longs stretched → SHORT (CF 0.5)
  - hyperliquid_funding_short_skew_bullish — mirror, DEX shorts
    stretched → LONG (CF 0.5)

Note on the original plan ("top-wallet positioning"): Hyperliquid
does NOT expose a wallet-leaderboard endpoint in its free public
info API. True wallet-level top-trader tracking would require either
curated wallet lists (hardcoded, out of policy per
feedback_no_hardcoded_symbols.md) or a paid analytics provider. The
DEX-vs-CEX divergence signal captures the same aggregate edge
without per-wallet enumeration.
Two free public retail-flow sources combined into a single contrarian
indicator. Retail enthusiasm marks tops; retail panic marks bottoms.

New module src/data/sources/social/retail-sentiment.ts:
  - Reddit fetcher polls r/cryptocurrency + r/cryptomoonshots
    /new.json (no auth, UA header only). Reddit blocks default
    fetch UAs so the adapter sends a vizzor-branded UA.
  - 4chan fetcher polls /biz/ catalog.json (no auth, simple HTML
    strip on the OP body).
  - Lexicon-based sentiment classifier: 50-word bullish list (moon,
    pump, accumulate, hodl, ...) + 50-word bearish list (dump, rug,
    capitulation, rekt, ...) scored per post.
  - Ticker extraction: $TICKER regex + small spelled-out-name alias
    map. No hardcoded allow-list — emits any ticker that appears.
  - 6h rolling history per ticker for mention-spike z-score
    (in-memory; SQLite persistence deferred to Round 4 if signal
    pays off).
  - retailEuphoriaFlag = spike >3σ AND sentiment > 0.7
  - retailCapitulationFlag = spike >3σ AND sentiment < -0.7
  - 60s poll throttle so concurrent symbol predictions share one
    HTTP fetch per cycle.

MarketContext + 2 contrarian FOL rules:
  - retailMentionSpikeZScore, retailSentimentScore,
    retailEuphoriaFlag, retailCapitulationFlag
  - retail_euphoria_top_warning_bearish (CF 0.45) — spike + bullish
    sentiment → distribution → reversal SHORT
  - retail_capitulation_bottom_warning_bullish (CF 0.45) — spike +
    bearish sentiment → panic → reversal LONG

Contrarian polarity is intentional per operator edge research:
retail enthusiasm has been the single best free-tier top indicator
across multiple cycles. The rules skip cleanly for symbols below
the chatter floor (ctx fields stay undefined).
… defaults

Two source-level hardcoded symbol lists violated the operator's "no
hardcoded coin lists" rule (feedback_no_hardcoded_symbols.md):

  - MAJOR_SYMBOLS in src/data/collector.ts — 23 tokens used as the
    default OHLCV background-ingest list
  - QUICK_SYMBOLS in src/telegram/ui-helpers.ts — 7 tokens rendered
    on the agent-wizard pair selector + cross-command filter rows

Both relocated to config:

  - schema.ts gains `telegram.quickSymbols` (defaults to the legacy
    7-token list) and a new top-level `collector` section
    (`symbols`, `timeframes`, `intervalMs` — defaults preserve the
    legacy MAJOR_SYMBOLS, TIMEFRAMES, COLLECTION_INTERVAL_MS values
    for behavior parity).
  - collector.ts reads via `getCollectorDefaults()` which delegates
    to `getConfig().collector` with a safe pre-config-load fallback.
  - ui-helpers.ts exposes `getQuickSymbols()` reading
    `getConfig().telegram.quickSymbols` at call time (so a YAML edit
    + reload takes effect without restart). The legacy
    `QUICK_SYMBOLS` const stays exported as @deprecated for any
    external import we missed.
  - agent-wizard.ts switched to `getQuickSymbols()`.

Follow-up (deferred to Round 3.1): expose runtime Telegram commands
like /collector-symbols and /quick-symbols so the operator can edit
the lists at runtime without touching YAML — that completes the
spirit of feedback_no_file_edits_for_operator_config.md. This commit
is the necessary structural pre-req.
…rst framing

Tighter copy, less marketing prose, and surface the AI free-text path
prominently — operators were missing that they can predict prices by
just messaging Vizzor in plain English (no slash needed). Both blocks
now end with a "skip the slashes — talk to it" callout + three example
prompts so new users see the chat surface as a first-class entry, not
a hidden footnote.

Section headings shortened (Predict / Scan / Engine + Bots / System)
and per-command descriptions trimmed to the single most useful verb so
the message scans in two seconds. Tone: confident and direct rather
than feature-list prose. MarkdownV2 escapes preserved throughout.
… (Polaris Round 3.1)

Operator bug 2026-05-25: after killing the bot, wiping
chronovisor_predictions + alert_rules, and restarting, OLD DMs still
fire — referencing predictions/alerts that no longer exist. The
operator's mental model is "after restart, only new emissions DM me".

Root cause: the Polaris G3 orphan-row guard in sendTelegramDm
suppresses DMs at ENQUEUE time only. Once a DM lands in
pending_notifications the poller drains it without re-checking the
underlying row. So predictions wiped between bot-down and bot-up
slip past — the queue rows survive the wipe, the poller picks them
up on session start, and they fire orphaned.

Two-layer fix:

Layer 1 — session-scoped expiry on startup.
  - New expirePendingDmsOlderThan(beforeMs) in pending-queue.ts
    marks-delivered every undelivered row with created_at < beforeMs.
  - Poller captures process-start time on init and calls expiry once
    with that timestamp. Anything queued by a previous session that
    survived the restart is discarded instead of spuriously delivered.
  - Cross-process safety: in api+bot split deployments the bot's
    session-start expiry can't kill fresh DMs the api just emitted
    because those rows have created_at > sessionStartMs.

Layer 2 — orphan-row re-guard at delivery.
  - New metadata TEXT column on pending_notifications (idempotent
    ALTER, backward-compatible — legacy rows have null metadata).
  - sendTelegramDm now passes notification.metadata through to
    enqueueTelegramDm.
  - Poller parses metadata before sending: if metadata.predictionId
    refers to a missing chronovisor_predictions row OR
    metadata.ruleId refers to a missing alert_rules row, mark
    delivered (without sending) and skip. Mid-session deletes are
    caught even when the DM was already queued.

Together: restart = clean queue + any mid-session deletes flush
their queued DMs. The operator sees only DMs whose underlying
prediction/alert still exists at delivery time.

No schema migration required — column add is idempotent. 1630 tests
remain green; per-test in-memory DBs get the new column on first
ensureTable() call.
imzzaidd added 11 commits May 25, 2026 07:39
… (Polaris Round 3.2)

Operator bug 2026-05-25 follow-up: after restarting the bot the
operator saw /alerts list 42 system alerts armed by predictions
from previous sessions. Manual SQL wipes that only delete
chronovisor_predictions leave the pred_* alert rules dangling.
On the next session those alerts keep firing against the live
price feed and DM the operator about predictions that no longer
exist (the price-alert-bridge fires, the orphan-row guard
suppresses the DM body but the alert keeps re-arming a
notification every poll).

Adds pruneOrphanedPredictionAlertRules() in notifications/store.ts:
  - select every alert_rule with id starting with 'pred_'
  - strip the 'pred_' prefix and the '_upper'/'_lower'/'_tp1'/'_sl'
    suffix to extract the underlying predictionId
  - cross-check against chronovisor_predictions; collect IDs whose
    prediction row is gone
  - batch DELETE the orphans in a single statement
  - returns the count for the startup log

Wired into bot startup before the pending-DM poller starts so the
prune runs once per process. Manual user alerts (no pred_ prefix)
are left untouched.

Together with Round 3.1's session-scoped DM expiry + delivery-time
orphan re-guard, this closes the operator's "restart still shows
old predictions/alerts" complaint at both the queue surface
(pending_notifications) and the rule surface (alert_rules).
In development the bot tolerates missing API keys (AI chat falls back,
market data still flows via Binance, etc.). In production missing keys
silently break user-facing features without a visible surface — the
operator only learns when users hit them. We enforce hard requirements
at boot so misconfiguration trips the process before users see it.

New `validateProductionConfig()` in loader.ts triggers when either
NODE_ENV='production' OR VIZZOR_REQUIRE_FULL_CONFIG=true (the manual
opt-in for staging / pre-prod checks). Validates:

  - ANTHROPIC_API_KEY required — the AI chat surface depends on it
  - TELEGRAM_BOT_TOKEN required — every operator entry point lives there
  - ML_API_SECRET required when ml.enabled=true — without it the ml
    sidecar accepts requests blindly

On missing values the bot throws at loadConfig() with a clear "missing
required value(s): X, Y" message naming the env vars to set. The check
is skipped in dev (NODE_ENV != production AND no opt-in) so local
development without an API key still works.

Zero behavior change for existing prod operators who already have the
required vars set; the check is opt-in for staging.
…d 3.3 A2)

Vizzor's existing global-error handlers print uncaught exceptions and
unhandled rejections to stderr. In production stderr ends up in docker
logs and never aggregates into "30 of these errors fired in the last
24h" intelligence. Sentry adds the missing aggregation layer with zero
PII (no user inputs, no chat IDs, no wallet addresses in error context).

New src/utils/sentry.ts owns:
  - initSentry(release?) — idempotent boot init gated on SENTRY_DSN
    env var. No-op when unset so dev runs stay quiet.
  - captureError(err, ctx?) — every call site uses this instead of
    importing @sentry/node directly so we can swap providers cleanly.
  - captureMessage(msg, level, ctx?) — for non-Error degradation logs.
  - flushSentry(timeoutMs) — drains queued events before shutdown.

PII scrub: Authorization-style headers (secret|token|key|auth) are
auto-redacted in beforeSend. We never put user inputs or PII into
error contexts ourselves.

Wiring in src/index.ts:
  - initSentry() runs before feature module imports so early-boot
    crashes get captured.
  - unhandledRejection + uncaughtException handlers call captureError
    BEFORE the stderr log. uncaughtException flushes 1.5s before
    process.exit(1) so the event lands in the dashboard.
  - onShutdown('sentry-flush', flushSentry, 'persist') drains the
    queue on SIGTERM/SIGINT before DB closes.

Configuration:
  - SENTRY_DSN — activates Sentry (absent = no-op)
  - SENTRY_TRACES_SAMPLE_RATE — defaults to 0 (no tracing)
  - NODE_ENV — tagged as the environment in Sentry
Cleared the high-impact vulnerabilities flagged in the production
audit:

  before: 6 moderate + 1 high
  after:  2 moderate + 1 high (1 ignored)

Patched:
  - postcss<8.5.10  → XSS via unescaped </style> in CSS stringify
  - brace-expansion → DoS in numeric range expansion
  - ws<8.20.1       → uninitialized memory disclosure in client receive
  - several others in the transitive graph

Remaining advisories sit in deep transitive paths (uuid via jayson via
@solana/web3.js) that we can't patch without forking; the @solana/web3.js
maintainers have a tracking issue. Score is at the floor available
without dropping wallet-chain support.

Verification:
  - pnpm typecheck — clean
  - pnpm test — 1630/1630 passing
…istory (Round 3.3 A4)

The production audit flagged that the canonical queries on three hot
tables had to table-scan for lack of an appropriate index. With even
the modest ~10k-row prediction history the operator's already
accumulated, /precisions, /wr, and the prediction-resolver were doing
unindexed scans per request.

Added (all CREATE INDEX IF NOT EXISTS, idempotent):

  chronovisor_predictions
    idx_predictions_symbol_created    (symbol, created_at DESC)
      per-symbol history queries used by /precisions <sym>, /wr <sym>,
      and the per-symbol panel rolls
    idx_predictions_resolved_at       (resolved_at) WHERE resolved_at IS NOT NULL
      resolved-window aggregations used by AccuracyTracker.getResolvedRecords
      and the calibration bootstrap replay
    idx_predictions_user_symbol       (user_id, symbol) WHERE user_id IS NOT NULL
      per-user filters used by /wr per operator
    idx_predictions_source_resolved   (source, resolved_at)
      forecast vs advisory cohort splits

  alert_rules
    idx_alert_rules_enabled_type      (enabled, type)
      price-alert-bridge scans active rules every poll cycle

  funding_history
    idx_funding_history_fetched_at    (fetched_at)
      standalone so the lazy-prune DELETE WHERE fetched_at < ? becomes
      a range scan instead of a table scan once the table crosses ~100k
      rows. The existing composite index is left-prefix on `venue` so
      SQLite couldn't use it for an unbounded fetched_at filter.

The partial-index WHERE clauses are tried first; older SQLite versions
fall back to the simple index form via the try/catch in
ensureHotPathIndices(). Either way the column index keeps point
lookups fast.

Zero functional behavior change — these are pure planner hints.
… 3.3 A5)

The bot's /health endpoint reports leader-lock state + heartbeat
freshness — useful for uptime monitors but also exposes process
internals (pid, uptime, heartbeat lag) to any caller that can reach
the port. When the deployment binds past 127.0.0.1 (Docker bridge,
k8s service, public load balancer) a leaked endpoint URL exposes
that to the open internet.

Adds opt-in bearer-token gate:

  - VIZZOR_HEALTH_TOKEN env var, when set, requires every request to
    carry `Authorization: Bearer <token>`.
  - Comparison uses constant-time string equality so an attacker can't
    distinguish "wrong length" from "wrong value" via response latency.
  - Returns 401 with a WWW-Authenticate: Bearer challenge on missing /
    bad credentials.
  - When the env var is absent, the endpoint stays open — backward
    compatible with existing operator setups that bind to loopback only.

Recommended for any deployment exposing /health beyond 127.0.0.1:
  - Generate a 32-byte hex token (openssl rand -hex 32)
  - Set VIZZOR_HEALTH_TOKEN=<token> in the prod env
  - Configure the uptime monitor with `Authorization: Bearer <token>`

Zero performance overhead when the env var is unset (one length check).
… (Round 3.3 B1)

The /health endpoint reports liveness but not rate / latency / error
counts. Without those the operator can't distinguish "bot is healthy"
from "bot is healthy but has emitted zero predictions in 4 hours" or
"bot is healthy but every Reddit fetch is timing out". Prometheus is
the standard pull-based surface; Grafana + Alertmanager hook on top.

New src/utils/metrics.ts owns:
  - registry + default process metrics (cpu, mem, gc, event-loop lag,
    vizzor_-prefixed)
  - vizzor_predictions_emitted_total{symbol,horizon,direction,tier}
  - vizzor_predictions_resolved_total{symbol,outcome}
  - vizzor_telegram_dm_delivered_total{type,result}
  - vizzor_data_source_fetch_total{source,result}
  - vizzor_engine_cycle_duration_seconds histogram
  - Small recorders (recordPredictionEmitted, recordTelegramDelivery,
    etc.) so adding a metric never requires importing prom-client
    outside this module — same isolation pattern as utils/sentry.ts.

Gated on VIZZOR_METRICS_PORT env var. When unset the module is a
no-op (zero overhead, no extra port consumed). Recommended prod
config:

  VIZZOR_METRICS_PORT=9090
  VIZZOR_METRICS_HOST=127.0.0.1  (default; expose only via reverse
                                  proxy or scrape from same host)

The scrape endpoint is /metrics; content-type is the prom-client
default text/plain so Prometheus picks it up without per-scrape
config.

Wired in src/telegram/bot.ts next to startBotHealthServer() and
registered with the graceful-shutdown coordinator's 'inbound' phase
so the listener drains cleanly on SIGTERM.

Call sites that should now record (deferred — not in this commit):
  - engine.predict() → recordPredictionEmitted + startEngineCycleTimer
  - prediction-resolver → recordPredictionResolved
  - telegram pending-poller → recordTelegramDelivery (ok/suppressed)
  - new data sources (Deribit / Hyperliquid / Reddit / 4chan /
    cross-venue-funding) → recordDataSourceFetch
Recording at call sites is intentionally split off so the
infrastructure ships first and the recorders can be added per-module
in follow-up commits without churning the whole graph.
…es (Round 3.3 B2)

The four Round 3.0 free-tier adapters (Deribit options, Hyperliquid
positioning, Reddit sentiment, 4chan /biz/) each caught their own
fetch errors and returned null. Fine for one-off failures but pessimal
for two classes:
  - Transient: upstream blips for 1-3s, retry would succeed; we miss
    the data.
  - Persistent: upstream rate-limits us for 10+ minutes; we keep
    hitting it on every cycle, burning request budget.

New src/utils/fetch-resilience.ts owns:
  - resilientFetch(url, source, opts) — fetch with exponential-backoff
    retry on 5xx + network + timeout errors (default 2 retries: 250ms
    + 600ms). 4xx does not retry (caller bug).
  - Per-source circuit breaker — after 5 consecutive failures the
    breaker opens and short-circuits all calls for 60s. Half-open
    probe lets one request through after cool-down to test recovery.
  - Metrics integration — every call emits
    vizzor_data_source_fetch_total{source,result} with result ∈ {ok,
    4xx, 5xx, timeout, network_error, breaker_open, *_retry}.
  - resilientJsonFetch(url, source, opts) — convenience that returns
    null on any failure (matches the existing adapter pattern).

Applied to:
  - deribit-options.ts → sources 'deribit_index', 'deribit_book'
  - hyperliquid-positions.ts → source 'hyperliquid'
  - retail-sentiment.ts → sources 'reddit:cryptocurrency',
    'reddit:cryptomoonshots', '4chan:biz'

The remaining adapters (cross-venue funding Bybit/OKX, eth-gas,
btc-mempool) are intentionally not migrated in this commit — they
already have stable upstreams (Bybit/OKX/etherscan/mempool.space)
and changing them in the same commit increases blast radius without
matching reward. They can adopt the same wrapper in a follow-up.

Zero behavior change when upstreams are healthy. Under degraded
conditions, the operator now sees:
  - Successful retries instead of intermittent gaps
  - The breaker_open counter spiking when an upstream stays down
  - Clean recovery via the half-open probe when it comes back
…Round 3.3 B3 + B4)

Two adjacent production-readiness fixes in docker-compose.yml.

B3 — pin floating :latest image tags:
  - postgres: timescale/timescaledb:latest-pg15 → 2.21.4-pg15
    `latest-pg15` was silently shipping PG15 patch releases AND
    TimescaleDB minor bumps on each compose pull, risking
    unannounced behavior changes (TimescaleDB 2.18 already removed
    deprecated APIs). Pinned to a verified tag; bumps now require
    intentional staging validation + a snapshot backup.
  - n8n: n8nio/n8n:latest → 1.117.4
    Same reason. n8n minor releases have broken workflows multiple
    times via undocumented API shape changes.

B4 — n8n healthcheck:
  Container previously reported "up" even when the internal n8n
  process crashed; depends_on couldn't tell. Added probe against
  the built-in /healthz endpoint that n8n exposes by default.

Bumps from here on require either:
  - A docker compose pull + smoke test in staging, OR
  - A documented rollback path on the prod machine

Pre-prod operators upgrading from main: run `docker compose pull
postgres n8n` AFTER taking a backup. Postgres minor-version
upgrades within the same major are safe; TimescaleDB minor-version
upgrades are usually safe but consult the upstream release notes
before applying in prod.
…ation (Round 3.4)

Operator bug 2026-05-25: after restarting prod and running /reset all,
old predictions and alerts kept firing as DMs. Root cause: every
Vizzor instance on the same machine wrote to the SAME files under
~/.vizzor/ — SQLite DB, config.yaml, wallets, ML state, leader lock,
pending DM queue.

When the operator ran dev (in one terminal) and prod (in another)
side-by-side, the two instances thrashed each other:

  - Dev writes a prediction → it lands in the shared chronovisor_predictions
  - Prod delivers the DM for it (whichever process holds leader-lock)
  - Operator runs /reset all on prod → wipes dev's data too
  - Dev keeps running → re-emits new predictions → operator thinks the
    reset failed because DMs from the "old" set keep arriving (they're
    actually NEW emissions, but indistinguishable to the operator)

  - Both bots fight for the same leader-lock; whichever loses runs in
    viewer mode and goes dark on its UI surface
  - Pending DM queue interleaves between the two instances
  - Alert rules from one instance trigger DMs in the other

Fix: getConfigDir() now honors VIZZOR_DATA_DIR env var when set.

  - Default: ~/.vizzor/  (no migration for existing operators)
  - Prod recommended: VIZZOR_DATA_DIR=/Users/<user>/.vizzor-prod

Both instances now have fully independent SQLite DBs, configs,
wallets, ML state, leader locks, alert rules, pending DM queues. No
shared mutable state.

NOT a complete fix on its own: instances sharing the same
TELEGRAM_BOT_TOKEN still compete for the inbound update stream
(Telegram only delivers each /predict etc. to whoever calls
getUpdates first). For full dev/prod isolation create a separate
Telegram bot via @Botfather for prod and put its token in the prod
config.yaml under VIZZOR_DATA_DIR.

No schema migration. No behavior change for operators who don't set
VIZZOR_DATA_DIR. 1630 tests remain green.
…ompt (Round 3.5)

After the Round 3.0 signal expansion shipped (Deribit options, Hyperliquid
positioning, funding z-score, retail sentiment NLP, multi-TF CVD, volume
profile POC), the operator's chat free-text predictions were still using
only the classic TA + on-chain stack (RSI, MACD, OBV, VWAP, taker, F&G).
The new signals were COMPUTED on every chronovisor call but never named
in the narrative output — the operator had no way to know whether they
fired or contributed.

This commit teaches the chat AI to surface them explicitly. Three edits:

§3.2 Signal Consistency Check — 6 new rows added to the bullish/bearish
classification table covering fundingZScore, optionsTermInversion +
otmSkew, hyperliquidFundingRate, cvdMultiTfDivergence, volumeProfile
POC, and retail euphoria/capitulation flags.

§3.4b Round 3.0 Institutional + Retail Signals — new section. Three
mandatory surfacing rules:
  (a) Single-horizon predictions must name each non-null Round 3.0
      field in the Signal Breakdown section and fold its bias into the
      direction call.
  (b) Multi-horizon / hour-by-hour breakdowns must reference Round 3.0
      state PER BUCKET — Round 3.0 signals carry equal weight to TA.
  (c) Null/warming-up fields are surfaced explicitly ("Round 3.0
      [field]: warming up (need N more samples)" or "n/a for this
      asset (Deribit limitation)").
Includes a 6-glyph visual marker system (institutional / smart money /
retail contrarian / POC magnet / funding window / gas-mempool) and a
concrete example bucket showing how to format the output.

§3.7 Contrarian Indicators — 10 new rows for retailEuphoriaFlag,
retailCapitulationFlag, fundingZScore extremes, options put/call skew
extremes, dexCex funding divergence, and multi-TF CVD divergence —
each tagged with its glyph so the chat output is scannable.

Zero engine changes. The signals were already computed; this commit
unlocks their narrative visibility. Operator can now run "predict SOL
every 30min today" and see Round 3.0 fields per bucket instead of
just classic TA.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant