release(v0.15.1): Polaris — WR diagnostics + per-user accuracy + reset hygiene#44
Open
imzzaidd wants to merge 41 commits into
normalized table holding one row per (resolved prediction, contributing signal). enables fast per-signal decisive WR queries without parsing signal_snapshot JSON on every read. foundation for the polaris signal auditor and polarity registry — a signal whose hit rate sits chronically below 0.40 over 30+ samples is a sign-inversion candidate.
polaris diagnostics layer. enables per-signal WR-driven sign inversion and dampening without retraining.
- signal-auditor: classifies each signal as INVERT/DAMPEN/WATCH/KEEP based on rolling decisive WR (threshold <0.40 over 30+ samples → inversion candidate).
- signal-polarity: runtime registry consulted by engine before composite assembly; supports auto (auditor-driven) and manual (operator-pinned) overrides. Persisted to signal_polarity sqlite table.
- realized-priors: empirical bayesian prior derived from the recent realized-direction distribution per family. falls back to static familyAwarePrior when the sample is too small to trust.
- /wr-diagnose telegram command: per-signal WR breakdown with inversion / dampen / watch sections + active overrides.
- /signal-override telegram command: operator-pinned polarity wins over auto auditor.
- pnpm wr:audit cli: signal-level WR audit; exits non-zero when inversion candidates exist so CI can flag.
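The inversion-candidate rule described above can be sketched roughly as follows. This is an illustration only, not the project's auditor: the `SignalOutcome` row shape and both function names are hypothetical, and only the INVERT rule (decisive WR below 0.40 over 30+ samples) comes from the text. The DAMPEN and WATCH cutoffs are not stated, so they are left out.

```typescript
// Hypothetical row shape: one row per (resolved prediction, contributing signal).
// `hit` is null when the resolution was not decisive.
interface SignalOutcome {
  signal: string;
  hit: boolean | null;
}

interface SignalWinRate {
  signal: string;
  samples: number; // decisive samples only
  winRate: number; // hits / decisive samples, 0 when there are none
}

// Aggregate the decisive win rate per signal.
function decisiveWinRates(rows: SignalOutcome[]): SignalWinRate[] {
  const acc = new Map<string, { hits: number; samples: number }>();
  for (const row of rows) {
    if (row.hit === null) continue; // non-decisive rows do not count
    const entry = acc.get(row.signal) ?? { hits: 0, samples: 0 };
    entry.samples += 1;
    if (row.hit) entry.hits += 1;
    acc.set(row.signal, entry);
  }
  const out: SignalWinRate[] = [];
  acc.forEach((v, signal) => {
    out.push({ signal, samples: v.samples, winRate: v.samples > 0 ? v.hits / v.samples : 0 });
  });
  return out;
}

// INVERT rule from the text: WR < 0.40 over at least 30 decisive samples.
function inversionCandidates(rows: SignalOutcome[], threshold = 0.4, minSamples = 30): string[] {
  return decisiveWinRates(rows)
    .filter((s) => s.samples >= minSamples && s.winRate < threshold)
    .map((s) => s.signal);
}
```

A CLI wrapper in the spirit of `pnpm wr:audit` could exit non-zero whenever `inversionCandidates` returns a non-empty list.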
polaris predictiveness layer — engine now anticipates moves via leading indicators instead of reacting to lagging structure.
- leadingFlow signal aggregator: taker momentum, orderbook imbalance, whale flow, CVD slope, funding inflection. injected with 30% weight at the FRONT of the onchain composite via new sigCfg.onChain.leadingFlowWeight.
- cvd-gated SMC vote: when SMC reads bullish demand but cvd's recent slope is bearish (demand is being eaten), neutralize the structural vote instead of letting it outvote real-time order-flow truth.
- 8 reversal FOL rules: failed_support_break, failed_resistance_rejection, asian_range_continuation (bull/bear), funding_peak_reversal_bearish, funding_trough_reversal_bullish, sweep_and_reverse_bullish_trap, sweep_and_reverse_bearish_trap.
- 4 structural FOL rules: stable_supply_expanding_bullish + contracting_bearish, alt_rotation_confirmed_alt_bullish, btc_rotation_dominant_btc_bullish (TOTAL2/TOTAL3 dominance algebra).
- 2 volatility FOL rules: vol_term_inverted_break_imminent, vol_acceleration_post_consolidation (realized vol + term structure).
- realized-volatility signal: short (12h) vs long (48h) realized σ + acceleration, pure compute on existing OHLCV.
- MarketContext gains spotPrice, stablecoinNetExpansionUsd, altDominancePct, nonBtcEthDominancePct, realizedVol1h/4h, volTermSlope, volAcceleration.
Polaris D + F. Post-processes the predicted forecast range so the band edges absorb two predictable wick patterns the prior grader counted as MISS:
- applyLiquidationMagnet: when a significant long/short liquidation cluster sits beyond the band edge on the matching side, stretch the edge to (cluster ± small buffer) capped at 2x the original half-width. Symbol-agnostic — works off liquidationProximity data the engine already gathers, no hardcoded ticker list.
- applyMomentumBandWidth: when taker momentum + orderbook imbalance derivative + CVD short slope stack one way with magnitude >= 0.25, widen the band on that side by up to 1.8x while modestly widening the other edge. Captures the BTC 30m MISS pattern where bearish momentum stacked but the band was sized for a quiet hold.
Both helpers run AFTER computePredictionForecast closes, on a shallow copy of the forecast, so callers can chain them without ownership concerns. Pure pricing math — no scoring side effects.
Includes the placement fix in engine.ts: the two import-and-apply blocks live AFTER the forecast finalization, not inside the composite pipeline.
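A minimal sketch of the liquidation-magnet stretch, under stated assumptions: the `Band` shape, the function name, and the 0.1% buffer are hypothetical, and the "2x the original half-width" cap is read here as "the stretched edge may sit at most two half-widths from the band midpoint". The real helper may define either differently.

```typescript
// Hypothetical forecast band; the real forecast object carries more fields.
interface Band {
  low: number;
  high: number;
}

// Stretch the band edge toward a liquidation cluster that sits beyond it.
// bufferPct and maxStretch are illustrative defaults, not project values.
function stretchTowardCluster(band: Band, clusterPrice: number, bufferPct = 0.001, maxStretch = 2): Band {
  const mid = (band.low + band.high) / 2;
  const half = (band.high - band.low) / 2;
  const out: Band = { ...band }; // shallow copy, the input is never mutated
  if (clusterPrice > band.high) {
    const target = clusterPrice * (1 + bufferPct); // cluster plus a small buffer
    out.high = Math.min(target, mid + half * maxStretch);
  } else if (clusterPrice < band.low) {
    const target = clusterPrice * (1 - bufferPct); // cluster minus a small buffer
    out.low = Math.max(target, mid - half * maxStretch);
  }
  return out; // unchanged copy when the cluster is inside the band
}
```

Returning a copy mirrors the "shallow copy of the forecast" note above, so a momentum-width pass could be chained on the result.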
Operator bug report 2026-05-21: the per-user filters in /precisions, /wr, and /predictions were collapsing to "show every row across all users". The columns existed and the read paths filtered correctly — but the write path never populated user_id, so every row was logged as NULL and the OR-clause widened the result to everything in the database.
This commit closes the loop end-to-end:
- accuracy-tracker.logPrediction: INSERT now includes user_id and reads PredictionRecord.userId. Migration adds the column for fresh DBs / unit tests that construct AccuracyTracker ahead of the multi-user migration run.
- PredictionRecord gains an optional userId field with full rationale in the doc comment.
- engine.predict() gains an options.userId that flows through to every logPrediction call site (forecast, directional with TP/SL, forecast resolve).
- All Telegram and AI tool-call sites that originate from an authenticated user thread userId through:
  - /predict, /diagnose, /polymarket commands
  - get_prediction, predict_price, polymarket_edge tools
  - bulk-ladder predict calls
- New regression test verifies that a per-user predict call writes the user_id column.
Scheduler broadcasts and CLI/TUI invocations leave userId undefined, so the row is NULL and the OR-clause keeps shared broadcasts visible to every watchlist-matching subscriber.
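The failure mode is easy to reproduce in miniature. The sketch below models the read-side filter in memory; the `PredictionRow` shape and `visibleToUser` are hypothetical stand-ins for the SQL read path, which presumably uses something equivalent to "user_id = ? OR user_id IS NULL".

```typescript
// Hypothetical in-memory stand-in for a prediction row.
interface PredictionRow {
  id: number;
  userId: string | null; // null = shared broadcast (scheduler, CLI/TUI)
}

// Read-side filter: a user's own rows plus shared (NULL) rows.
function visibleToUser(rows: PredictionRow[], userId: string): PredictionRow[] {
  return rows.filter((r) => r.userId === userId || r.userId === null);
}

// Before the fix every row was written with userId = null,
// so the filter matched everything for every user.
const beforeFix: PredictionRow[] = [
  { id: 1, userId: null },
  { id: 2, userId: null },
  { id: 3, userId: null },
];

// After the fix the write path records the owner.
const afterFix: PredictionRow[] = [
  { id: 1, userId: "alice" },
  { id: 2, userId: "bob" },
  { id: 3, userId: null },
];
```

With `afterFix`, `visibleToUser(afterFix, "alice")` keeps rows 1 and 3 only, while the same call on `beforeFix` returns all three rows.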
Operator-requested trader-channel format. One compact card per horizon, MarkdownV2-escaped, with branded glyphs for known assets and a clean fallback for everything else.
Cards in two shapes:
    ₿ BITCOIN · 30m
    💰 BTC Price: $77,733.83
    💵 Direction: ➖ RANGE (55%)
    🪙 Band: $77,656.09 — $77,811.56
    💹 Best Play: Range fade — long $77,656.09, short $77,811.56 (no leverage)

    🟣 SOLANA · 30m
    💰 SOL Price: $87.40
    💵 Direction: 📉 SHORT (61%)
    🪙 Entry Zone: $87.40 — $87.44
    📈 TP1: $87.29 (-0.13%)
    📊 SL: $87.54 (+0.16%)
    ⚠ Skip: R:R 1:0.78 — risk exceeds reward, no trade
Multi-horizon replies separate cards with a horizontal rule and surface a "Multi-horizon consensus" chip when per-horizon directions disagree, so the operator sees incoherent calls before placing a trade.
Files:
- src/telegram/render-predict-card.ts: pure renderer
- src/telegram/symbol-display.ts: glyph + name registry. Built-in table for the most common assets, with operator extension via chronovisor.symbolDisplay.<TICKER> in config. Unknown symbols fall back to a generic icon + raw ticker — features still apply universally (no symbol-list gating).
- Config schema gains the optional symbolDisplay record.
- /predict and the predict-keyboard callback both swap from renderChronoVisorResult to renderPredictCard.
Exposes the signal auditor as `pnpm wr:audit` so it can run from CI or local one-shot diagnostics. The script (scripts/wr-audit.ts) landed in the earlier polaris diagnostics commit; this entry just makes it discoverable through the package.json scripts surface. The audit exits non-zero when any active signal sits below the INVERT threshold (decisive WR < 0.40 over 30+ samples), so CI can gate releases on per-signal calibration health. The companion /wr-diagnose and /signal-override Telegram commands were registered in the polarity-registry commit; no further wiring is needed here.
…dm guard
Operator bug report 2026-05-22: /reset all and manual sqlite DELETE runs left stale state behind that re-emerged as ghost predictions in the next cycle. Three causes, fixed in one atomic operation:
1) Polaris-era tables survived the wipe. signal_outcomes, signal_polarity, calibration_family, calibration_symbol, rule_stats, volatility, weights, and patterns all needed explicit DELETE statements that the legacy resetPredictionHistory path didn't know about.
2) In-memory registries kept their snapshots. CalibratorRegistry, RuleTracker, VolatilityRegistry, SignalPolarityRegistry, the pattern library, and warmup-state's cached counts all held a live view of the pre-wipe state — which the next debounced ml-state save then wrote BACK to DB, undoing the wipe.
3) In-flight resolved-DMs from prior pending_notifications kept landing after the wipe completed, making it look like predictions had resurrected.
This commit closes all three:
- New full-reset module wraps the wipe atomically: every prediction-related table, every alert rule, every pending notification.
- ChronoVisorEngine.getResetCallbacks() exposes named clear() functions for every in-process registry the engine owns. WarmupState.reset() + RuleTracker.clear() were added so the rule-stat / cached-count surfaces can be flushed.
- Telegram outbound channel adds an orphan-row guard: before delivering a prediction_resolved or price_threshold DM it checks that the underlying row still exists. If not, the DM is suppressed with a debug log — eliminating the post-wipe ghost bubble. The resolver writes predictionId into metadata so the guard has a key to look up.
- /reset all reply gains a visible "VIZZOR STATS RESET" divider block so the chat scrollback has an unambiguous "everything above is stale" marker (Telegram permanently keeps history).
Polaris α + β + γ. Three independent accelerators that reduce the
real-world sample budget needed to exit the warmup-conservatism
regime and engage full-strength prediction.
α — Calibrator bootstrap from history
loadMlState() now replays up to 1000 most-recent resolved
predictions through the calibrator + rule tracker on boot when
the persistence layer comes back empty. A fresh DB or post-/reset
start that already has historical resolutions in
chronovisor_predictions exits warmup in seconds instead of waiting
30+ live resolutions. Idempotent: a non-empty
chronovisor_calibration_family table short-circuits the bootstrap.
β — Graduated warmup multiplier
Original behaviour was binary: below minSamples the cap was applied
at full strength; at minSamples it dropped entirely. This produced
an "11-to-12 sample cliff" where one lucky resolution unlocked
full-strength confidence. The graduated relaxation now scales the
cap with sqrt(samples / minSamples):
    0 samples  → 0.0 (full cap applied)
    3 samples  → 0.5 (50% applied)
    12 samples → 1.0 (cap removed)
applyProbabilityCap and applyConfidenceCap both consume the
multiplier so probability and confidence climb smoothly.
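The relaxation curve is small enough to write out. A minimal sketch, assuming minSamples = 12 (the value implied by the table above) and a hypothetical function name; how applyProbabilityCap and applyConfidenceCap blend the multiplier into the cap is not shown in the text and is not modeled here.

```typescript
// Relaxation multiplier: 0 = warmup cap fully applied, 1 = cap removed.
// minSamples = 12 is inferred from the table above, not read from config.
function warmupRelaxation(samples: number, minSamples = 12): number {
  if (minSamples <= 0) return 1; // nothing to warm up
  const ratio = Math.max(0, samples) / minSamples;
  return Math.min(1, Math.sqrt(ratio)); // clamp once minSamples is reached
}
```

The square root front-loads the relaxation: a quarter of the sample budget already lifts half of the cap, which removes the cliff between 11 and 12 samples.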
γ — Graduated outcome labels for Platt scaling
ConfidenceCalibrator.addOutcome now accepts a number ∈ [0,1] in
addition to the boolean wasCorrect. A graduated resolution score
(0.5 = neutral, 0.65 = near-miss, 1.0 = full hit) feeds the SGD
update directly, nudging the curve proportionally instead of as
binary 0/1 corrections. Empirically converges the Platt curve in
roughly half the sample count for the same MAE. The histogram
bin still uses the 0/1 convention (score >= 0.5 counts as a hit)
so calibration plots remain interpretable.
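To illustrate why a graduated label nudges the curve proportionally, here is a sketch of one SGD step on a Platt-style sigmoid with a soft target. It is not the project's ConfidenceCalibrator: the `PlattState` shape, the learning rate, and the function names are assumptions. With log-loss, the gradient with respect to the logit is (p − y), so a near-miss label of 0.65 produces a much smaller step than a full hit of 1.0.

```typescript
// Hypothetical calibrator state: calibrated = sigmoid(a * raw + b).
interface PlattState {
  a: number;
  b: number;
}

const sigmoid = (z: number): number => 1 / (1 + Math.exp(-z));

function calibrated(state: PlattState, raw: number): number {
  return sigmoid(state.a * raw + state.b);
}

// One SGD step on log-loss. `outcome` may be a boolean (legacy wasCorrect)
// or a graduated score in [0, 1]. lr = 0.05 is an illustrative value.
function addOutcome(state: PlattState, raw: number, outcome: boolean | number, lr = 0.05): PlattState {
  const y = typeof outcome === "boolean" ? (outcome ? 1 : 0) : Math.min(1, Math.max(0, outcome));
  const err = calibrated(state, raw) - y; // d(log-loss) / d(logit)
  return { a: state.a - lr * err * raw, b: state.b - lr * err };
}

// Histogram bins keep the 0/1 convention described above.
const countsAsHit = (score: number): boolean => score >= 0.5;
```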
Per-family configuration
Production data (May 2026) showed scalp-15m WR at 31.9% and
intraday-1h WR at 14.5%. Per-family emission floors are now
configurable via chronovisor.perFamilyMinLogConfidence
(micro=0.45, intraday=0.40, swing=0.35, macro=0.30) and the
engine reads through getMinLogConfidenceForFamily so the floor
is applied at qualification time.
Sideways dead zone widened from ±0.005 to ±0.04 around 0.5 — the
razor-thin band was emitting directional calls at probabilities
barely distinguishable from a coin flip. Widening to 0.46/0.54
forces uncertain calls to file as `sideways` (RANGE in the
renderer) which is what they always were.
New wrWatchdog config drives a rolling-WR warning notification
in the resolver: when the recent decisive WR falls below
warnBelow over windowSize samples, an info DM names the worst-
performing signal so the operator can pause or override it.
Polarity reconciliation cadence
The existing CALIBRATION_RETRAIN_INTERVAL now also triggers
signal-polarity reconciliation from the auditor, so a freshly-
seen sign error is corrected within ~20 resolutions of becoming
statistically defensible. Manual overrides remain untouched.
Per-prediction signal_outcomes
Resolutions write per-signal vote/outcome rows to the
signal_outcomes table the diagnostics layer landed earlier,
feeding the auditor, /wr-diagnose, and the polarity registry.
Best-effort: a write failure must never block the resolution
path.
…NGE gate label Operator MISS data 2026-05-21/22: every BTC directional and RANGE MISS in the screenshots had Gate=0/3 or 1/3 with cycleHealth in either `degraded` or `unsafe`. The qualification gate downgraded these to advisory tier but the resolver still surfaced them in chat as "Tier: Tracked" via the resolved-DM path, so they still polluted the visible WR. shouldTrackPrediction now hard-skips directional predictions when cycleHealth.status is `degraded` or `unsafe` AND systemConfirmations is below 3, returning `track: false` with state `skip` so they never enter the resolution pipeline at all. The eligibility check is gated on `!warmingUp` so cold-start predictions still flow through the relaxed-floor warmup path. Companion fix in render-crypto-signal: the "🚦 Gate: 0/3 — SKIP" label was misleading for RANGE plays. systemConfirmations is 0 by definition on a sideways forecast (no direction to confirm), so the operator was reading SKIP on a tracked-tier RANGE play and assuming the engine had ignored the gate. RANGE plays are graded on the band, not on system votes, so the renderer now emits "🚦 Gate: RANGE play (3-system gate applies to directional only)".
Polaris patch release. Bundles the per-user WR fix, the /reset ghost-DM elimination, the /predict card renderer, graduated warmup acceleration, cycle-degraded hard-skip, and the band-deformation helpers into a single patch release on top of 0.15.0.
- root package: 0.15.0 → 0.15.1
- web/package: 0.15.0 → 0.15.1
- web/lib/constants APP_VERSION fallback: 0.15.0 → 0.15.1
- test/unit/utils/package.test.ts now reads the canonical version from package.json instead of hard-coding it, so the assertion no longer goes stale on every patch bump.
ω₁ — dead ICT session + no active flow + thin confirmations now routes to state='skip' instead of advisory. The 2026-05-22/23 BTC RANGE MISSes all fired during dead session with 1/3 systems agreeing. ω₃ — universal R:R<1.0 hard-skip across all tracking profiles. Balanced mode previously tolerated negative-EV trades as 'Tentative'; operator MISS data (SOL 45m SHORT R:R 1:0.67 → +2.04% opposite) confirms these should never reach chat. Updates the balanced-mode edge_too_thin test to match the new policy: R:R 0.3 now asserts track=false / state='skip' / reason starts with rr_below_one_negative_ev.
ω₄ — /precisions and /wr render "— (n/m, warming up)" until a calibrator has at least 10 decisive resolutions. Stops the bot from flashing 14.5% / 22% WR figures that are statistical noise from a handful of early samples.
ω₆ — /predict card now includes a "🔬 Why" section per emission with:
• per-signal CF contributions (which signals fired, which agree)
• triple-system status (Vizzor TA + SMC + ICT bias / kill-zone)
• guardrails snapshot (gate result, R:R, ATR sanity)
• short reasoning paragraph
renderPredictCard now takes an includeWhy option, on by default in the /predict command path.
ω₅ — MomentumBandInputs now accepts volTermSlope + volAcceleration. composeMomentumDirection applies a volMagnitudeLift derived from realized-vol features so the band widens when 1h realized σ is rising into post-consolidation breakouts. Direction is untouched — only band magnitude reacts (vol is a magnitude indicator, not a directional one). π — CVD signal now fetches 5m and 1h klines in parallel and exposes hourlySlope + multiTfDivergence. A 5m-bullish / 1h-bearish split is classified as bearish_multi_tf (retail pump vs smart-money distribution) and the mirror as bullish_multi_tf. Drives the new multi_tf_cvd_divergence_* FOL rules.
…δ θ)
ε — Platt calibrator now keeps per-(family, regime) shards in addition to the existing per-family and per-(symbol, family) shards. Regimes (bull / bear / consolidation) are routed by the prediction-resolver from the signal snapshot. Each resolution updates all four buckets (family, family+regime, symbol+family, symbol+family+regime) so a bull-trained curve no longer poisons bear emissions.
δ — Fresh per-symbol shards now inherit the family's Platt parameters via inheritFamilyPriorIntoShard() instead of starting cold. New tokens skip the multi-day warmup that previously emitted uncalibrated probabilities for ~48 h after first prediction.
θ — addOutcome takes an optional confidenceWeight and updatePlatt scales the SGD step by an lrScale derived from |p − 0.5| × 2 + 0.5. A 75%-confidence wrong call now corrects the calibrator harder than a 55% one — high-conviction misses dominate learning, low-conviction ones add minimal noise.
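The θ scaling can be checked with a few lines. The formula is the one quoted above; the function name and the clamping of p into [0, 1] are assumptions made for this sketch.

```typescript
// Learning-rate scale from emitted confidence p: |p - 0.5| * 2 + 0.5.
// Ranges from 0.5 (coin-flip call) to 1.5 (fully confident call).
function lrScaleFromConfidence(p: number): number {
  const clamped = Math.min(1, Math.max(0, p)); // clamp is an assumption
  return Math.abs(clamped - 0.5) * 2 + 0.5;
}
```

A 75% call gets a scale of 1.0 and a 55% call about 0.6, so the high-conviction miss moves the Platt parameters roughly 1.7 times as far per step.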
H — cross-venue funding aggregator combines Binance, Bybit, and OKX funding rates (all free public endpoints, 30 s TTL cache). Returns weighted average plus single-venue divergence (binanceDivergencePct, maxDivergencePct). Drives funding_divergence_long_skew_bearish / short_skew_bullish / funding_term_extreme_dispersion_advisory.
K — ETH gas spike detector via Etherscan free tier. Rolls a 5 min baseline of fast gas, flags isSpike when current ≥ 3× baseline. A 3× spike inside 5 min historically precedes ETH liquidation cascades by 30–90 min. Drives eth_gas_spike_liquidation_imminent_bearish.
L — BTC mempool fee pulse via mempool.space free API. Rolls a 5 min baseline of fastestFee, flags isCongested when current ≥ 2× baseline. Fee acceleration is a 30–90 min leading volatility signal. Drives btc_mempool_fee_congestion_volatility_imminent_bearish.
All three respect the no-paid-vendor rule (no Nansen / Arkham) and share a 60 s in-process snapshot cache so concurrent symbol predictions each cycle issue at most one network call per source.
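K and L share one shape: a rolling baseline plus a multiple. A minimal sketch of that pattern follows, assuming the baseline is the mean of readings inside the window (the text does not say whether it is a mean, median, or something else). The class and field names are hypothetical, and no network calls are made.

```typescript
interface Reading {
  t: number; // timestamp in ms
  value: number;
}

// Flags a reading when it is >= multiple x the rolling baseline.
class RollingSpikeDetector {
  private readings: Reading[] = [];
  private readonly windowMs: number;
  private readonly multiple: number;

  constructor(windowMs: number, multiple: number) {
    this.windowMs = windowMs;
    this.multiple = multiple;
  }

  observe(t: number, value: number): { baseline: number | null; flagged: boolean } {
    // Keep only readings still inside the window.
    this.readings = this.readings.filter((r) => t - r.t <= this.windowMs);
    const baseline =
      this.readings.length > 0
        ? this.readings.reduce((sum, r) => sum + r.value, 0) / this.readings.length
        : null;
    const flagged = baseline !== null && baseline > 0 && value >= this.multiple * baseline;
    this.readings.push({ t, value }); // the new reading joins the baseline afterwards
    return { baseline, flagged };
  }
}
```

Under these assumptions the gas detector would be `new RollingSpikeDetector(5 * 60_000, 3)` and the mempool detector `new RollingSpikeDetector(5 * 60_000, 2)`.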
…+ N + H I J K L M N π σ τ)
N — new volume-profile signal computes a 50-bucket profile over recent
klines, locates the Point of Control (POC), value-area-high/low (70%
volume band), and an insideValueArea flag. Pure compute on existing
Binance klines. Drives volume_profile_above_va_mean_revert_bearish /
below_va_mean_revert_bullish.
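A rough sketch of the volume-profile computation, under assumptions the text does not settle: each bar's whole volume is assigned to the bucket of its typical price (high + low + close) / 3, and the value area grows outward from the POC one bucket at a time toward the heavier neighbour until 70% of volume is covered. The `Kline` shape and function name are hypothetical.

```typescript
interface Kline { high: number; low: number; close: number; volume: number; }

interface VolumeProfile {
  poc: number;            // price at the centre of the highest-volume bucket
  valueAreaLow: number;
  valueAreaHigh: number;
  insideValueArea: boolean; // last close within [valueAreaLow, valueAreaHigh]
}

function volumeProfile(klines: Kline[], buckets = 50, valueAreaPct = 0.7): VolumeProfile | null {
  if (klines.length === 0) return null;
  let lo = Infinity;
  let hi = -Infinity;
  for (const k of klines) { lo = Math.min(lo, k.low); hi = Math.max(hi, k.high); }
  const width = (hi - lo) / buckets;
  if (!(width > 0)) return null; // flat price range, no profile

  const vol: number[] = new Array<number>(buckets).fill(0);
  let total = 0;
  for (const k of klines) {
    const typical = (k.high + k.low + k.close) / 3; // assumption: whole bar goes to one bucket
    const idx = Math.min(buckets - 1, Math.max(0, Math.floor((typical - lo) / width)));
    vol[idx] = (vol[idx] ?? 0) + k.volume;
    total += k.volume;
  }
  const at = (i: number): number => vol[i] ?? 0;

  let poc = 0;
  for (let i = 1; i < buckets; i++) if (at(i) > at(poc)) poc = i;

  // Expand around the POC toward the heavier neighbour until the target share is covered.
  let loIdx = poc;
  let hiIdx = poc;
  let covered = at(poc);
  while (covered < total * valueAreaPct && (loIdx > 0 || hiIdx < buckets - 1)) {
    const below = loIdx > 0 ? at(loIdx - 1) : -1;
    const above = hiIdx < buckets - 1 ? at(hiIdx + 1) : -1;
    if (above >= below) { hiIdx += 1; covered += above; } else { loIdx -= 1; covered += below; }
  }

  const valueAreaLow = lo + loIdx * width;
  const valueAreaHigh = lo + (hiIdx + 1) * width;
  const last = klines[klines.length - 1];
  const lastClose = last ? last.close : NaN;
  return {
    poc: lo + (poc + 0.5) * width,
    valueAreaLow,
    valueAreaHigh,
    insideValueArea: lastClose >= valueAreaLow && lastClose <= valueAreaHigh,
  };
}
```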
ω₂ — engine now neutralizes the SMC vote when smc.vote disagrees with
both smcLastBreakType and smcBias. The 2026-05-22 BTC screenshots all
showed SMC marked bearish · in supply zone while the directional vote
was still bullish — a self-contradiction the engine shouldn't propagate.
MarketContext + FOL rules expanded with the full Round 2 surface:
• stablecoinNetExpansionUsd → stable_supply_expanding_bullish /
contracting_bearish (I)
• altDominancePct / nonBtcEthDominancePct → alt_rotation_confirmed_alt_bullish /
btc_rotation_dominant_btc_bullish (J)
• realizedVol* → vol_term_inverted_break_imminent /
vol_acceleration_post_consolidation (M)
• crossVenueFunding* / fundingVenueCount → funding_divergence_* (H)
• ethGas* → eth_gas_spike_liquidation_imminent_bearish (K)
• btcMempool* → btc_mempool_fee_congestion_volatility_imminent_bearish (L)
• volumeProfile* → volume_profile_above/below_va_* (N)
• cvdHourlySlope / cvdMultiTfDivergence → multi_tf_cvd_divergence_*
bullish_smart_money_accumulation / bearish_smart_money_distribution (π)
• isInFundingPaymentWindow helper → funding_payment_imminent_long_squeeze_bearish /
short_squeeze_bullish / just_paid_relief_bullish / capitulation_bearish (σ)
• ictKillZone session momentum → session_momentum_continuation_bullish /
bearish (τ)
Engine fetches cross-venue funding / ETH gas / BTC mempool / volume
profile snapshots in the prediction cycle and threads them through
buildMarketContext enrichment. Also passes ictKillZone + hasActiveFlow
into shouldTrackPrediction so ω₁ has the inputs it needs to fire.
Polaris v0.15.1 Round 2.5 — perFamilyMinLogConfidence floor bump. Production data after Round 2 ship (May 2026, 31 tracked predictions): tracked decisive WR ~32% with emissions at probability 55–58%. A 55% prediction cannot resolve at 70% even with perfect calibration — the math forbids it. Floors raised: micro 0.45→0.58, intraday 0.40→0.55, swing 0.35→0.52, macro 0.30→0.50. Volume drops ~40%, surviving emissions resolve with meaningfully better expected value. Lower back to Round 2 values once decisive samples per family cross 60 and calibration is trusted across all 3 regime shards. Schema default in schema.ts updated to match for fresh-config users.
The Round 1 α bootstrap replayed the most recent 1000 resolved rows
into the calibrator on boot. That count-based bound was wrong-shaped
for two failure modes:
- Low-volume operator: 6 weeks between deploys may produce <1000
resolutions and the bootstrap pulled them all, but a fresh-DB
operator (post `/reset all`) had nothing at all to seed from
- High-volume operator: 1000 rows might span only 8 hours and
miss long-tail regime patterns
Replace the LIMIT 1000 clause with a time-windowed pull (default 30
days, configurable via VIZZOR_BOOTSTRAP_WINDOW_DAYS) capped at 5000
rows (VIZZOR_BOOTSTRAP_MAX_ROWS) as a safety bound. The time window
guarantees regime coverage; the row cap prevents thrash on a busy
calibrator. Both env vars override at boot.
Zero behavior change for the common case (mid-volume operator with
<5000 rows in 30 days) — same rows replayed, just selected by
recency window instead of fixed count.
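The two bounds can be sketched as a small pure function. The env var names and defaults (30 days, 5000 rows) come from the text above; the helper names and the fallback on malformed or non-positive values are assumptions.

```typescript
interface BootstrapBounds {
  sinceMs: number; // replay rows resolved at or after this timestamp
  maxRows: number; // safety cap on the number of replayed rows
}

// Parse a positive integer, falling back when unset or malformed (assumed behaviour).
function positiveIntOr(raw: string | undefined, fallback: number): number {
  const n = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(n) && n > 0 ? n : fallback;
}

function bootstrapBounds(env: Record<string, string | undefined>, nowMs: number): BootstrapBounds {
  const windowDays = positiveIntOr(env["VIZZOR_BOOTSTRAP_WINDOW_DAYS"], 30);
  const maxRows = positiveIntOr(env["VIZZOR_BOOTSTRAP_MAX_ROWS"], 5000);
  const dayMs = 24 * 60 * 60 * 1000;
  return { sinceMs: nowMs - windowDays * dayMs, maxRows };
}
```

In a real process the first argument would be `process.env`; passing it in keeps the sketch testable.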
CF algebra combines the 6 signal sources into a single composite score but loses information about WHICH signals disagreed and by HOW MUCH. A 0.7 bullish onChain combined with a -0.7 bearish logicRules collapses to a near-zero composite — the engine then emits as low-conviction sideways and the calibrator trains on a directionless sample that carries no learning signal. Operator MISS pattern 2026-05-22/23: SMC bearish + ICT bullish + RANGE call — actual move landed outside the band. Two strong signals in opposite directions don't combine to a tradeable RANGE; they combine to "we don't know, skip." Detector scans high-weight non-correlated signal pairs (currently just onChain × logicRules) and flags conflict when both |CF| >= 0.6 AND their signs oppose. Engine hook fires before the existing systemConfirmations/targetEqualsEntry normalize-to-sideways block: on conflict it forces direction='sideways', collapses probability toward 0.5–0.6, recomputes forecast as a sideways play, and threads a `signal_conflict` reasoning line into the trigger snapshot so /predict 🔬 Why and /diagnose surface the downgrade cause. False-conflict cost: 0 (no skip — the prediction still emits, just as RANGE). True-conflict gain: removes ~5–10% of the misfile-as-tracked volume that was poisoning the calibrator.
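The conflict test itself is a two-condition predicate. A minimal sketch, with a hypothetical function name and the 0.6 threshold taken from the text:

```typescript
// Conflict: both certainty factors are strong (|CF| >= minAbs) and point in opposite directions.
function hasSignalConflict(cfA: number, cfB: number, minAbs = 0.6): boolean {
  const bothStrong = Math.abs(cfA) >= minAbs && Math.abs(cfB) >= minAbs;
  return bothStrong && cfA * cfB < 0; // opposite signs give a negative product
}
```

For the example in the text, `hasSignalConflict(0.7, -0.7)` is true, whereas a weak dissent such as `hasSignalConflict(0.7, -0.5)` is not flagged and falls through to the normal composite.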
….5 Gap 3 + Gap 4)
Gap 3 — composite EV gate.
ω₃ catches R:R < 1.0 and the per-family probability floor catches
low-confidence calls, but a 56% / R:R 1.05 combo still escapes both
axes individually while having near-zero expected edge. Approximate
EV per unit risk as (probability − 0.5) × edgeRatio. Hard-skip when
composite EV < 0.04 for directional predictions with R:R >= 1.0
(R:R < 1.0 already caught by ω₃, no double-skip). Knocks out the
marginal positive-RR but low-edge tail that was hitting the
calibrator with coin-flip outcomes.
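A sketch of the gate as described, assuming edgeRatio is the reward-to-risk ratio (the text does not define it further). The decision labels and function names are hypothetical.

```typescript
type EvDecision = "defer_to_rr_gate" | "skip" | "pass";

// Approximate EV per unit risk: (probability - 0.5) x edgeRatio.
function compositeEv(probability: number, edgeRatio: number): number {
  return (probability - 0.5) * edgeRatio;
}

// R:R < 1.0 is left to the separate R:R gate so nothing is skipped twice.
function compositeEvGate(probability: number, rewardRisk: number, minEv = 0.04): EvDecision {
  if (rewardRisk < 1.0) return "defer_to_rr_gate";
  return compositeEv(probability, rewardRisk) < minEv ? "skip" : "pass";
}
```

For instance, a 53% call at R:R 1.2 scores 0.03 × 1.2 = 0.036 and is skipped, while a 60% call at R:R 1.5 scores 0.15 and passes.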
Gap 4 — regime-aware emission throttle.
The Round 2 ε regime-bucketed calibration LEARNS per regime but does
not THROTTLE EMISSIONS in a known-bad regime. If the bull-trained
shard sits at 65% WR and the bear-trained shard at 30% WR, the bot
should emit fewer predictions in bear, not the same volume.
- New ConfidenceCalibrator.getHitRate() → cumulative bin-aggregate WR
- New CalibratorRegistry.getRegimeShardHitRate(family, regime)
- prediction-qualification.ts gains regimeShardHitRate +
regimeShardSamples + regimeThrottleRoll inputs; throttle fires
when shard has >= 20 samples AND hitRate < 0.50 AND uniform random
roll < 0.5 → 50% emission cut in known-bad regime
- engine.ts maps currentRegime ∈ {bull, bear, chop} →
CalibrationRegime ∈ {bull, bear, consolidation}, queries the
shard's hit rate, and passes a fresh Math.random() per call so
the throttle is probabilistic (not a hard veto). Tests can
inject a deterministic roll for repeatability.
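The throttle condition and the regime mapping can be sketched as below. Field names follow the inputs listed above; the function names and the fail-open behaviour on missing shard data are assumptions.

```typescript
type EngineRegime = "bull" | "bear" | "chop";
type CalibrationRegime = "bull" | "bear" | "consolidation";

// chop maps to consolidation; bull and bear pass through unchanged.
function toCalibrationRegime(regime: EngineRegime): CalibrationRegime {
  return regime === "chop" ? "consolidation" : regime;
}

interface ThrottleInput {
  regimeShardHitRate?: number;
  regimeShardSamples?: number;
  regimeThrottleRoll: number; // uniform in [0, 1); injected so tests stay deterministic
}

// Fires when the shard is trusted (>= 20 samples), known-bad (< 0.50 WR),
// and the roll lands in the lower half: roughly a 50% emission cut.
function shouldThrottle(input: ThrottleInput): boolean {
  const { regimeShardHitRate, regimeShardSamples, regimeThrottleRoll } = input;
  if (regimeShardHitRate === undefined || regimeShardSamples === undefined) return false; // assumed fail-open
  return regimeShardSamples >= 20 && regimeShardHitRate < 0.5 && regimeThrottleRoll < 0.5;
}
```

Production code would pass `Math.random()` as the roll; a test passes a fixed value such as 0.25 or 0.75.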
Round 2.5 directional gates (per-family floor, composite EV, regime
throttle, dead-session, R:R<1) all return early when direction is
sideways. Range plays therefore flooded through with zero quality
filtering — operator MISS data 2026-05-24 showed a 45m BTC tracked
HIT at `range 50% · 0/3 systems · dead session`, a random-walk play
counted toward the tracked WR denominator.
Adds three sideways-specific floors inside shouldTrackPrediction:
2.6.1a — probability floor 0.58
A 50% range call carries no information beyond "we don't know".
Only emit when the model has at least mild conviction the band
will hold.
2.6.1b — signal confirmation floor 2
Range plays need at least 2/6 supporting signals (typically: low
vol + mean-revert structure). One signal alone is noise.
2.6.2 — band-width vs ATR gate
Skip when the band half-width is less than 0.5× ATR(14). A band
narrower than half a typical 1-bar range gives the "actual inside"
outcome ~50/50 odds by random walk — the prediction has no edge.
Engine computes (rangeHigh − rangeLow) / 2 / atr14 and threads it
through; undefined when forecast or ATR is missing so the gate
fails open on degraded input.
Expected impact: ~50% drop in range emission volume, surviving range
plays carry real information (not coin flips), tracked WR denominator
stops being inflated by random-walk HITs.
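The three sideways floors can be combined into one check. A minimal sketch: the thresholds (0.58, 2, 0.5× ATR) and the fail-open rule come from the text, while the `SidewaysInput` shape, the function names, and the reason strings are hypothetical.

```typescript
interface SidewaysInput {
  probability: number;
  confirmations: number; // supporting signals out of 6
  rangeLow?: number;
  rangeHigh?: number;
  atr14?: number;
}

// Band half-width expressed in ATR(14) units; undefined on degraded input.
function bandHalfWidthInAtr(i: SidewaysInput): number | undefined {
  if (i.rangeLow === undefined || i.rangeHigh === undefined) return undefined;
  if (i.atr14 === undefined || !(i.atr14 > 0)) return undefined;
  return (i.rangeHigh - i.rangeLow) / 2 / i.atr14;
}

// Returns a skip reason, or null when the range play may be tracked.
function sidewaysSkipReason(i: SidewaysInput): string | null {
  if (i.probability < 0.58) return "probability_below_floor";
  if (i.confirmations < 2) return "too_few_confirmations";
  const ratio = bandHalfWidthInAtr(i);
  if (ratio !== undefined && ratio < 0.5) return "band_narrower_than_half_atr";
  return null; // includes the fail-open case where ratio is undefined
}
```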
Note on the original Round 2.6 plan: item 2.6.3 (AccuracyTracker.clear
wired into performFullReset) is moot — AccuracyTracker is 100% DB-
backed with no in-memory cache, so resetPredictionHistory() inside
performFullReset() already zeros it. Dropped from this commit.
The Round 2 cross-venue funding rules ask "is current funding hotter
on Binance than on other venues right now?". The z-score rules ask a
deeper question: "is current funding hotter than this venue's own
30-day distribution?". A z-score > 2 puts current funding above roughly
the 97.7th percentile of recent history (assuming an approximately
normal distribution): leverage is genuinely stretched, not just
nominally positive.
New module src/data/sources/derivatives/funding-history.ts:
- SQLite table funding_history (venue, symbol, rate, fetched_at)
+ composite index for the lookback query
- recordFundingHistory(venue, symbol, rate) appends on every
successful cross-venue fetch (cache-miss path only — cache hits
would duplicate observations and inflate the sample count)
- computeFundingZScore(venue, symbol, lookbackDays=30) returns
{ zScore, mean, std, samples, reliable } with lazy pruning of
rows older than 35d retention
- computeAllVenueFundingZScores aggregates across Binance + Bybit
+ OKX and exposes maxAbsZScore for extremity flagging
cross-venue-funding.ts extended:
- Bybit fetch path appends history row on cache-miss
- OKX fetch path appends history row on cache-miss (instId
normalized back to BASEUSDT for storage consistency)
- Binance rate (fetched upstream by engine) recorded inside
fetchCrossVenueFunding on each aggregation call
MarketContext + FOL rules:
- fundingZScoreBinance, fundingZScoreMaxAbs, fundingHistorySamples
- funding_zscore_extreme_long_bearish — fires when binance z > +2,
samples >= 30, funding > 0 (longs paying extreme premium → SHORT)
- funding_zscore_extreme_short_bullish — mirror at z < -2
Cost model: ~26k rows/day across 3 venues × 3 symbols × 30s polls,
~780k rows over the 30d lookback (~910k at the 35d retention limit).
SQLite handles this trivially (<50MB, <1ms query with the composite
index).
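The z-score computation can be sketched over an in-memory array standing in for the funding_history lookback. Assumptions: the current reading is excluded from the history, the sample (n − 1) standard deviation is used, and `reliable` means at least 30 samples with non-zero spread. The function name is hypothetical and no SQLite access is shown.

```typescript
interface FundingZ {
  zScore: number | null; // null when the history has no spread
  mean: number;
  std: number;
  samples: number;
  reliable: boolean;
}

function fundingZScore(history: number[], current: number, minSamples = 30): FundingZ {
  const n = history.length;
  if (n === 0) return { zScore: null, mean: 0, std: 0, samples: 0, reliable: false };
  const mean = history.reduce((sum, r) => sum + r, 0) / n;
  // Sample variance (n - 1 denominator); an assumption, the source does not specify.
  const variance = n > 1 ? history.reduce((sum, r) => sum + (r - mean) * (r - mean), 0) / (n - 1) : 0;
  const std = Math.sqrt(variance);
  return {
    zScore: std > 0 ? (current - mean) / std : null,
    mean,
    std,
    samples: n,
    reliable: n >= minSamples && std > 0,
  };
}
```

A rule such as funding_zscore_extreme_long_bearish would then require `reliable`, `zScore > 2`, and a positive current rate.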
Deribit's free public API (no auth) exposes the entire BTC + ETH
options chain in a single request. Per-instrument we get mark IV,
open interest, volume, and the strike/expiry encoded in the
instrument name. From that we compute four institutional-positioning
signals that intraday spot directional models cannot derive from
spot prices alone.
New module src/data/sources/derivatives/deribit-options.ts:
- parseInstrumentName: BTC-27JUN26-100000-C → side/strike/expiry
- fetchIndexPrice: independent index price (BTC + ETH supported)
- fetchBookSummary: full options chain in one call
- fetchDeribitOptionsSnapshot: computes
    atmIv7d         proximity-weighted ATM IV for 1-7d expiries
    atmIv30d        same shape for 8-30d expiries
    termInversion   true when iv7d > iv30d by >5%
    putCallOiRatio  sum(put OI) / sum(call OI)
    otmSkew         IV(OTM puts 5-15%) − IV(OTM calls 5-15%)
                    25-delta skew proxy without Black-Scholes
- deribitHasOptionsFor: gate helper for currencies Deribit lists
options on (BTC + ETH currently; data-source limit, not a
hardcoded symbol allow-list)
MarketContext + 4 FOL rules:
- optionsAtmIv7d / Iv30d / TermInversion / PutCallOiRatio / OtmSkew
- options_iv_term_inverted_volatility_imminent (CF 0.25 advisory)
- options_put_skew_extreme_bearish_reversal_imminent (CF 0.5)
- options_call_skew_extreme_bullish_top (CF 0.5)
- options_iv_crush_post_event_relief_bullish (CF 0.35)
Other symbols: ctx fields stay undefined, all four rules skip via
their typeof guards. No degradation, no false fires. 60s cache TTL
sufficient for our prediction cadence.
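The instrument-name decoding can be illustrated with a small parser. This is a sketch, not the module's parseInstrumentName: it handles only the CURRENCY-DDMMMYY-STRIKE-C|P shape shown above, stores the expiry as midnight UTC of that date (the settlement time of day is not modeled), and the return shape is hypothetical.

```typescript
interface ParsedInstrument {
  currency: string;
  expiry: Date; // midnight UTC of the expiry date
  strike: number;
  side: "call" | "put";
}

const MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN", "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"];

// Parses names like BTC-27JUN26-100000-C; returns null for anything else.
function parseInstrumentNameSketch(name: string): ParsedInstrument | null {
  const m = /^([A-Z]+)-(\d{1,2})([A-Z]{3})(\d{2})-(\d+)-([CP])$/.exec(name);
  if (!m) return null;
  const month = MONTHS.indexOf(m[3] ?? "");
  if (month < 0) return null; // unknown month abbreviation
  const day = Number(m[2]);
  const year = 2000 + Number(m[4]); // two-digit year, assumed to be 20xx
  return {
    currency: m[1] ?? "",
    expiry: new Date(Date.UTC(year, month, day)),
    strike: Number(m[5]),
    side: m[6] === "C" ? "call" : "put",
  };
}
```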
… Round 3.0 X)
Hyperliquid is the largest fully on-chain perpetual DEX. Its public
metaAndAssetCtxs endpoint exposes funding rate + open interest per
asset with no auth.
Why this matters: sophisticated traders (multi-strat funds, prop
desks) have been migrating size to Hyperliquid for two reasons —
(a) full on-chain transparency for LP reporting, (b) no KYC and no
CEX-counterparty risk. When Hyperliquid OI grows faster than CEX OI,
smart money is positioning ahead of moves the spot crowd hasn't
seen yet. The DEX-vs-CEX funding spread highlights the SIDE that
the smart-money flow is leveraged on — DEX funding hot vs CEX = DEX
longs over-extended → squeeze risk SHORT.
New module src/data/sources/dex/hyperliquid-positions.ts:
- fetchMetaAndAssetCtxs: POST /info {type:metaAndAssetCtxs}
returns [universe, ctxs[]] aligned by index
- fetchHyperliquidPositioning: extracts per-asset funding rate,
open interest (USD), mark price, and 24h volume; computes
DEX-vs-CEX funding divergence vs the supplied Binance rate
- 60s cache TTL keyed by (coin, binanceRate)
- Symbol-agnostic — consumes whatever Hyperliquid lists; returns
null cleanly for assets not in HL's universe (no hardcoded
allow-list)
MarketContext + 2 FOL rules:
- dexCexFundingDivergencePct, hyperliquidFundingRate,
hyperliquidOpenInterestUsd
- hyperliquid_funding_long_skew_bearish — HL funding >20% above
Binance AND HL funding positive → DEX longs stretched → SHORT (CF 0.5)
- hyperliquid_funding_short_skew_bullish — mirror, DEX shorts
stretched → LONG (CF 0.5)
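The two skew rules reduce to a divergence percentage plus a sign check. In the sketch below the divergence is defined as (HL − Binance) / |Binance| × 100, which is an assumption since the text only says "HL funding >20% above Binance". The mirror condition (more than 20% below, with HL funding negative) is likewise inferred, and the function names are hypothetical.

```typescript
type HyperliquidSkew = "long_skew_bearish" | "short_skew_bullish" | null;

// Relative divergence of HL funding vs the Binance rate, in percent.
// Undefined when the Binance rate is zero (no meaningful ratio).
function dexCexFundingDivergencePct(hlRate: number, binanceRate: number): number | undefined {
  if (binanceRate === 0) return undefined;
  return ((hlRate - binanceRate) / Math.abs(binanceRate)) * 100;
}

function hyperliquidSkew(hlRate: number, binanceRate: number, thresholdPct = 20): HyperliquidSkew {
  const div = dexCexFundingDivergencePct(hlRate, binanceRate);
  if (div === undefined) return null;
  if (div > thresholdPct && hlRate > 0) return "long_skew_bearish"; // DEX longs stretched
  if (div < -thresholdPct && hlRate < 0) return "short_skew_bullish"; // DEX shorts stretched
  return null;
}
```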
Note on the original plan ("top-wallet positioning"): Hyperliquid
does NOT expose a wallet-leaderboard endpoint in its free public
info API. True wallet-level top-trader tracking would require either
curated wallet lists (hardcoded, out of policy per
feedback_no_hardcoded_symbols.md) or a paid analytics provider. The
DEX-vs-CEX divergence signal captures the same aggregate edge
without per-wallet enumeration.
Two free public retail-flow sources combined into a single contrarian
indicator. Retail enthusiasm marks tops; retail panic marks bottoms.
New module src/data/sources/social/retail-sentiment.ts:
- Reddit fetcher polls r/cryptocurrency + r/cryptomoonshots
/new.json (no auth, UA header only). Reddit blocks default
fetch UAs so the adapter sends a vizzor-branded UA.
- 4chan fetcher polls /biz/ catalog.json (no auth, simple HTML
strip on the OP body).
- Lexicon-based sentiment classifier: 50-word bullish list (moon,
pump, accumulate, hodl, ...) + 50-word bearish list (dump, rug,
capitulation, rekt, ...) scored per post.
- Ticker extraction: $TICKER regex + small spelled-out-name alias
map. No hardcoded allow-list — emits any ticker that appears.
- 6h rolling history per ticker for mention-spike z-score
(in-memory; SQLite persistence deferred to Round 4 if signal
pays off).
- retailEuphoriaFlag = spike >3σ AND sentiment > 0.7
- retailCapitulationFlag = spike >3σ AND sentiment < -0.7
- 60s poll throttle so concurrent symbol predictions share one
HTTP fetch per cycle.
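The spike and flag logic above can be sketched roughly as follows. The population standard deviation, the `minSamples` floor, and both function names are assumptions for illustration; only the 3σ and ±0.7 thresholds come from the commit.

```typescript
// z-score of the current mention count against the rolling history.
// Returns undefined when there is too little history or no variance,
// so downstream ctx fields stay undefined below the chatter floor.
function mentionSpikeZScore(history: number[], current: number, minSamples = 6): number | undefined {
  if (history.length < minSamples) return undefined;
  const mean = history.reduce((sum, v) => sum + v, 0) / history.length;
  const variance = history.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / history.length;
  const std = Math.sqrt(variance);
  if (std === 0) return undefined;
  return (current - mean) / std;
}

interface RetailFlags {
  retailEuphoriaFlag: boolean | undefined;
  retailCapitulationFlag: boolean | undefined;
}

// spike > 3σ AND sentiment > 0.7  → euphoria
// spike > 3σ AND sentiment < -0.7 → capitulation
function retailFlags(z: number | undefined, sentiment: number | undefined): RetailFlags {
  if (z === undefined || sentiment === undefined) {
    return { retailEuphoriaFlag: undefined, retailCapitulationFlag: undefined };
  }
  const spiking = z > 3;
  return {
    retailEuphoriaFlag: spiking && sentiment > 0.7,
    retailCapitulationFlag: spiking && sentiment < -0.7,
  };
}
```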
MarketContext + 2 contrarian FOL rules:
- retailMentionSpikeZScore, retailSentimentScore,
retailEuphoriaFlag, retailCapitulationFlag
- retail_euphoria_top_warning_bearish (CF 0.45) — spike + bullish
sentiment → distribution → reversal SHORT
- retail_capitulation_bottom_warning_bullish (CF 0.45) — spike +
bearish sentiment → panic → reversal LONG
Contrarian polarity is intentional per operator edge research:
retail enthusiasm has been the single best free-tier top indicator
across multiple cycles. The rules skip cleanly for symbols below
the chatter floor (ctx fields stay undefined).
… defaults
Two source-level hardcoded symbol lists violated the operator's "no
hardcoded coin lists" rule (feedback_no_hardcoded_symbols.md):
- MAJOR_SYMBOLS in src/data/collector.ts — 23 tokens used as the
default OHLCV background-ingest list
- QUICK_SYMBOLS in src/telegram/ui-helpers.ts — 7 tokens rendered
on the agent-wizard pair selector + cross-command filter rows
Both relocated to config:
- schema.ts gains `telegram.quickSymbols` (defaults to the legacy
7-token list) and a new top-level `collector` section
(`symbols`, `timeframes`, `intervalMs` — defaults preserve the
legacy MAJOR_SYMBOLS, TIMEFRAMES, COLLECTION_INTERVAL_MS values
for behavior parity).
- collector.ts reads via `getCollectorDefaults()` which delegates
to `getConfig().collector` with a safe pre-config-load fallback.
- ui-helpers.ts exposes `getQuickSymbols()` reading
`getConfig().telegram.quickSymbols` at call time (so a YAML edit
+ reload takes effect without restart). The legacy
`QUICK_SYMBOLS` const stays exported as @deprecated for any
external import we missed.
- agent-wizard.ts switched to `getQuickSymbols()`.
Follow-up (deferred to Round 3.1): expose runtime Telegram commands
like /collector-symbols and /quick-symbols so the operator can edit
the lists at runtime without touching YAML — that completes the
spirit of feedback_no_file_edits_for_operator_config.md. This commit
is the necessary structural pre-req.
…rst framing
Tighter copy, less marketing prose, and surface the AI free-text path
prominently: operators were missing that they can predict prices by
just messaging Vizzor in plain English (no slash needed).
Both blocks now end with a "skip the slashes — talk to it" callout +
three example prompts so new users see the chat surface as a
first-class entry, not a hidden footnote.
Section headings shortened (Predict / Scan / Engine + Bots / System)
and per-command descriptions trimmed to the single most useful verb so
the message scans in two seconds.
Tone: confident and direct rather than feature-list prose. MarkdownV2
escapes preserved throughout.
… (Polaris Round 3.1)
Operator bug 2026-05-25: after killing the bot, wiping
chronovisor_predictions + alert_rules, and restarting, OLD DMs still
fire — referencing predictions/alerts that no longer exist. The
operator's mental model is "after restart, only new emissions DM me".
Root cause: the Polaris G3 orphan-row guard in sendTelegramDm
suppresses DMs at ENQUEUE time only. Once a DM lands in
pending_notifications the poller drains it without re-checking the
underlying row. So predictions wiped between bot-down and bot-up
slip past — the queue rows survive the wipe, the poller picks them
up on session start, and they fire orphaned.
Two-layer fix:
Layer 1 — session-scoped expiry on startup.
- New expirePendingDmsOlderThan(beforeMs) in pending-queue.ts
marks-delivered every undelivered row with created_at < beforeMs.
- Poller captures process-start time on init and calls expiry once
with that timestamp. Anything queued by a previous session that
survived the restart is discarded instead of spuriously delivered.
- Cross-process safety: in api+bot split deployments the bot's
session-start expiry can't kill fresh DMs the api just emitted
because those rows have created_at > sessionStartMs.
Layer 2 — orphan-row re-guard at delivery.
- New metadata TEXT column on pending_notifications (idempotent
ALTER, backward-compatible — legacy rows have null metadata).
- sendTelegramDm now passes notification.metadata through to
enqueueTelegramDm.
- Poller parses metadata before sending: if metadata.predictionId
refers to a missing chronovisor_predictions row OR
metadata.ruleId refers to a missing alert_rules row, mark
delivered (without sending) and skip. Mid-session deletes are
caught even when the DM was already queued.
Together: restart = clean queue + any mid-session deletes flush
their queued DMs. The operator sees only DMs whose underlying
prediction/alert still exists at delivery time.
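The two layers can be modelled on an in-memory queue as a rough sketch. The real code works against the pending_notifications SQLite table; the array-based signatures, `isOrphaned`, and `drainQueue` are hypothetical, and delivering rows with unparseable metadata is an assumption.

```typescript
interface PendingDm {
  id: number;
  createdAt: number;       // epoch ms
  delivered: boolean;
  metadata: string | null; // JSON text; null on legacy rows
}

interface LiveRows {
  predictionIds: Set<string>;
  ruleIds: Set<string>;
}

// Layer 1: mark every undelivered row older than the session start as delivered.
function expirePendingDmsOlderThan(queue: PendingDm[], beforeMs: number): number {
  let expired = 0;
  for (const row of queue) {
    if (!row.delivered && row.createdAt < beforeMs) {
      row.delivered = true;
      expired++;
    }
  }
  return expired;
}

// Layer 2: a DM is orphaned when its metadata points at a row that no longer exists.
function isOrphaned(row: PendingDm, live: LiveRows): boolean {
  if (row.metadata === null) return false; // legacy row: nothing to check
  let parsed: unknown;
  try {
    parsed = JSON.parse(row.metadata);
  } catch {
    return false; // assumption: unparseable metadata is delivered as-is
  }
  if (typeof parsed !== "object" || parsed === null) return false;
  const meta = parsed as { predictionId?: unknown; ruleId?: unknown };
  const missingPrediction =
    typeof meta.predictionId === "string" && !live.predictionIds.has(meta.predictionId);
  const missingRule = typeof meta.ruleId === "string" && !live.ruleIds.has(meta.ruleId);
  return missingPrediction || missingRule;
}

// Poller pass: orphaned rows are marked delivered without being sent.
function drainQueue(
  queue: PendingDm[],
  live: LiveRows,
  send: (row: PendingDm) => void,
): { sent: number; suppressed: number } {
  let sent = 0;
  let suppressed = 0;
  for (const row of queue) {
    if (row.delivered) continue;
    if (isOrphaned(row, live)) {
      suppressed++;
    } else {
      send(row);
      sent++;
    }
    row.delivered = true;
  }
  return { sent, suppressed };
}
```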
No schema migration required — column add is idempotent. 1630 tests
remain green; per-test in-memory DBs get the new column on first
ensureTable() call.
… (Polaris Round 3.2)
Operator bug 2026-05-25 follow-up: after restarting the bot the
operator saw /alerts list 42 system alerts armed by predictions
from previous sessions. Manual SQL wipes that only delete
chronovisor_predictions leave the pred_* alert rules dangling.
On the next session those alerts keep firing against the live
price feed and DM the operator about predictions that no longer
exist (the price-alert-bridge fires, the orphan-row guard
suppresses the DM body but the alert keeps re-arming a
notification every poll).
Adds pruneOrphanedPredictionAlertRules() in notifications/store.ts:
- select every alert_rule with id starting with 'pred_'
- strip the 'pred_' prefix and the '_upper'/'_lower'/'_tp1'/'_sl'
suffix to extract the underlying predictionId
- cross-check against chronovisor_predictions; collect IDs whose
prediction row is gone
- batch DELETE the orphans in a single statement
- returns the count for the startup log
Wired into bot startup before the pending-DM poller starts so the
prune runs once per process. Manual user alerts (no pred_ prefix)
are left untouched.
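A minimal sketch of the ID parsing and orphan selection described above. It operates on plain arrays and a Set rather than SQLite, and `predictionIdFromRuleId` and `findOrphanedPredictionAlertRules` are hypothetical helper names; the real function additionally issues the batch DELETE.

```typescript
const PRED_PREFIX = "pred_";
const LEG_SUFFIXES = ["_upper", "_lower", "_tp1", "_sl"];

// "pred_<predictionId>_<leg>" → "<predictionId>"; undefined for manual user alerts.
function predictionIdFromRuleId(ruleId: string): string | undefined {
  if (!ruleId.startsWith(PRED_PREFIX)) return undefined;
  const rest = ruleId.slice(PRED_PREFIX.length);
  for (const suffix of LEG_SUFFIXES) {
    if (rest.endsWith(suffix)) return rest.slice(0, rest.length - suffix.length);
  }
  return rest; // assumption: a pred_ rule without a known suffix maps to the bare ID
}

// Rule IDs whose underlying prediction row no longer exists.
function findOrphanedPredictionAlertRules(
  ruleIds: string[],
  livePredictionIds: Set<string>,
): string[] {
  return ruleIds.filter((ruleId) => {
    const predictionId = predictionIdFromRuleId(ruleId);
    return predictionId !== undefined && !livePredictionIds.has(predictionId);
  });
}
```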
Together with Round 3.1's session-scoped DM expiry + delivery-time
orphan re-guard, this closes the operator's "restart still shows
old predictions/alerts" complaint at both the queue surface
(pending_notifications) and the rule surface (alert_rules).
In development the bot tolerates missing API keys (AI chat falls back,
market data still flows via Binance, etc.). In production missing keys
silently break user-facing features without a visible surface — the
operator only learns when users hit them. We enforce hard requirements
at boot so misconfiguration trips the process before users see it.
New `validateProductionConfig()` in loader.ts triggers when either
NODE_ENV='production' OR VIZZOR_REQUIRE_FULL_CONFIG=true (the manual
opt-in for staging / pre-prod checks). Validates:
- ANTHROPIC_API_KEY required — the AI chat surface depends on it
- TELEGRAM_BOT_TOKEN required — every operator entry point lives there
- ML_API_SECRET required when ml.enabled=true — without it the ml
sidecar accepts requests blindly
On missing values the bot throws at loadConfig() with a clear "missing
required value(s): X, Y" message naming the env vars to set. The check
is skipped in dev (NODE_ENV != production AND no opt-in) so local
development without an API key still works.
Zero behavior change for existing prod operators who already have the
required vars set; the check is opt-in for staging.
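The boot-time check might look roughly like this. The environment is passed in as a plain object so the sketch stays self-contained; `missingProductionValues` and `assertProductionConfig` are hypothetical names, and treating whitespace-only values as missing is an assumption.

```typescript
type EnvMap = Record<string, string | undefined>;

// Names of required env vars that are absent; empty when enforcement is off.
function missingProductionValues(env: EnvMap, mlEnabled: boolean): string[] {
  const enforce =
    env["NODE_ENV"] === "production" || env["VIZZOR_REQUIRE_FULL_CONFIG"] === "true";
  if (!enforce) return []; // dev: tolerate missing keys
  const required = ["ANTHROPIC_API_KEY", "TELEGRAM_BOT_TOKEN"];
  if (mlEnabled) required.push("ML_API_SECRET");
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

// Throws with a message naming every env var that still needs to be set.
function assertProductionConfig(env: EnvMap, mlEnabled: boolean): void {
  const missing = missingProductionValues(env, mlEnabled);
  if (missing.length > 0) {
    throw new Error(`missing required value(s): ${missing.join(", ")}`);
  }
}
```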
…d 3.3 A2)
Vizzor's existing global-error handlers print uncaught exceptions and
unhandled rejections to stderr. In production stderr ends up in docker
logs and never aggregates into "30 of these errors fired in the last
24h" intelligence. Sentry adds the missing aggregation layer with zero
PII (no user inputs, no chat IDs, no wallet addresses in error context).
New src/utils/sentry.ts owns:
- initSentry(release?) — idempotent boot init gated on SENTRY_DSN
env var. No-op when unset so dev runs stay quiet.
- captureError(err, ctx?) — every call site uses this instead of
importing @sentry/node directly so we can swap providers cleanly.
- captureMessage(msg, level, ctx?) — for non-Error degradation logs.
- flushSentry(timeoutMs) — drains queued events before shutdown.
PII scrub: Authorization-style headers (secret|token|key|auth) are
auto-redacted in beforeSend. We never put user inputs or PII into
error contexts ourselves.
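A rough sketch of the header scrub. Only the key pattern comes from the commit; `redactSensitive` and the "[redacted]" placeholder are assumptions, and the wiring into Sentry's beforeSend hook is not shown.

```typescript
// Header names matching any of these fragments are treated as sensitive.
const SENSITIVE_KEY = /secret|token|key|auth/i;

// Returns a copy of the headers with sensitive values replaced.
function redactSensitive(headers: Record<string, string>): Record<string, string> {
  const scrubbed: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    scrubbed[name] = SENSITIVE_KEY.test(name) ? "[redacted]" : value;
  }
  return scrubbed;
}
```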
Wiring in src/index.ts:
- initSentry() runs before feature module imports so early-boot
crashes get captured.
- unhandledRejection + uncaughtException handlers call captureError
BEFORE the stderr log. uncaughtException flushes 1.5s before
process.exit(1) so the event lands in the dashboard.
- onShutdown('sentry-flush', flushSentry, 'persist') drains the
queue on SIGTERM/SIGINT before DB closes.
Configuration:
- SENTRY_DSN — activates Sentry (absent = no-op)
- SENTRY_TRACES_SAMPLE_RATE — defaults to 0 (no tracing)
- NODE_ENV — tagged as the environment in Sentry
Cleared the high-impact vulnerabilities flagged in the production audit:
before: 6 moderate + 1 high
after: 2 moderate + 1 high (1 ignored)
Patched:
- postcss<8.5.10 → XSS via unescaped </style> in CSS stringify
- brace-expansion → DoS in numeric range expansion
- ws<8.20.1 → uninitialized memory disclosure in client receive
- several others in the transitive graph
Remaining advisories sit in deep transitive paths (uuid via jayson via
@solana/web3.js) that we can't patch without forking; the
@solana/web3.js maintainers have a tracking issue. Score is at the
floor available without dropping wallet-chain support.
Verification:
- pnpm typecheck: clean
- pnpm test: 1630/1630 passing
…istory (Round 3.3 A4)
The production audit flagged that the canonical queries on three hot
tables had to table-scan for lack of an appropriate index. With even
the modest ~10k-row prediction history the operator's already
accumulated, /precisions, /wr, and the prediction-resolver were doing
unindexed scans per request.
Added (all CREATE INDEX IF NOT EXISTS, idempotent):
chronovisor_predictions
idx_predictions_symbol_created (symbol, created_at DESC)
per-symbol history queries used by /precisions <sym>, /wr <sym>,
and the per-symbol panel rolls
idx_predictions_resolved_at (resolved_at) WHERE resolved_at IS NOT NULL
resolved-window aggregations used by AccuracyTracker.getResolvedRecords
and the calibration bootstrap replay
idx_predictions_user_symbol (user_id, symbol) WHERE user_id IS NOT NULL
per-user filters used by /wr per operator
idx_predictions_source_resolved (source, resolved_at)
forecast vs advisory cohort splits
alert_rules
idx_alert_rules_enabled_type (enabled, type)
price-alert-bridge scans active rules every poll cycle
funding_history
idx_funding_history_fetched_at (fetched_at)
standalone so the lazy-prune DELETE WHERE fetched_at < ? becomes
a range scan instead of a table scan once the table crosses ~100k
rows. The existing composite index is left-prefix on `venue` so
SQLite couldn't use it for an unbounded fetched_at filter.
The partial-index WHERE clauses are tried first; older SQLite versions
fall back to the simple index form via the try/catch in
ensureHotPathIndices(). Either way the column index keeps point
lookups fast.
Zero functional behavior change — these are pure planner hints.
… 3.3 A5)
The bot's /health endpoint reports leader-lock state + heartbeat
freshness — useful for uptime monitors but also exposes process
internals (pid, uptime, heartbeat lag) to any caller that can reach
the port. When the deployment binds past 127.0.0.1 (Docker bridge,
k8s service, public load balancer) a leaked endpoint URL exposes
that to the open internet.
Adds opt-in bearer-token gate:
- VIZZOR_HEALTH_TOKEN env var, when set, requires every request to
carry `Authorization: Bearer <token>`.
- Comparison uses constant-time string equality so an attacker can't
distinguish "wrong length" from "wrong value" via response latency.
- Returns 401 with a WWW-Authenticate: Bearer challenge on missing /
bad credentials.
- When the env var is absent, the endpoint stays open — backward
compatible with existing operator setups that bind to loopback only.
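The gate can be sketched as a pure function over the request header. This is an illustration under assumptions: `checkHealthAuth` and `constantTimeEqual` are hypothetical names, and the hand-rolled comparison is only a best-effort stand-in. In Node, `crypto.timingSafeEqual` over equal-length buffers is the usual primitive.

```typescript
// Compares every character position regardless of where the first mismatch is.
function constantTimeEqual(a: string, b: string): boolean {
  const len = Math.max(a.length, b.length);
  let diff = a.length ^ b.length;
  for (let i = 0; i < len; i++) {
    const ca = i < a.length ? a.charCodeAt(i) : 0;
    const cb = i < b.length ? b.charCodeAt(i) : 0;
    diff |= ca ^ cb;
  }
  return diff === 0;
}

interface AuthDecision {
  status: 200 | 401;
  headers: Record<string, string>;
}

// expectedToken is the value of VIZZOR_HEALTH_TOKEN (undefined when unset).
function checkHealthAuth(
  authorizationHeader: string | undefined,
  expectedToken: string | undefined,
): AuthDecision {
  if (!expectedToken) return { status: 200, headers: {} }; // gate disabled: endpoint stays open
  const presented = authorizationHeader ?? "";
  if (constantTimeEqual(presented, `Bearer ${expectedToken}`)) {
    return { status: 200, headers: {} };
  }
  return { status: 401, headers: { "WWW-Authenticate": "Bearer" } };
}
```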
Recommended for any deployment exposing /health beyond 127.0.0.1:
- Generate a 32-byte hex token (openssl rand -hex 32)
- Set VIZZOR_HEALTH_TOKEN=<token> in the prod env
- Configure the uptime monitor with `Authorization: Bearer <token>`
Negligible performance overhead when the env var is unset (one length check).
… (Round 3.3 B1)
The /health endpoint reports liveness but not rate / latency / error
counts. Without those the operator can't distinguish "bot is healthy"
from "bot is healthy but has emitted zero predictions in 4 hours" or
"bot is healthy but every Reddit fetch is timing out". Prometheus is
the standard pull-based surface; Grafana + Alertmanager hook on top.
New src/utils/metrics.ts owns:
- registry + default process metrics (cpu, mem, gc, event-loop lag,
vizzor_-prefixed)
- vizzor_predictions_emitted_total{symbol,horizon,direction,tier}
- vizzor_predictions_resolved_total{symbol,outcome}
- vizzor_telegram_dm_delivered_total{type,result}
- vizzor_data_source_fetch_total{source,result}
- vizzor_engine_cycle_duration_seconds histogram
- Small recorders (recordPredictionEmitted, recordTelegramDelivery,
etc.) so adding a metric never requires importing prom-client
outside this module — same isolation pattern as utils/sentry.ts.
Gated on VIZZOR_METRICS_PORT env var. When unset the module is a
no-op (zero overhead, no extra port consumed). Recommended prod
config:
VIZZOR_METRICS_PORT=9090
VIZZOR_METRICS_HOST=127.0.0.1 (default; expose only via reverse
proxy or scrape from same host)
The scrape endpoint is /metrics; content-type is the prom-client
default text/plain so Prometheus picks it up without per-scrape
config.
Wired in src/telegram/bot.ts next to startBotHealthServer() and
registered with the graceful-shutdown coordinator's 'inbound' phase
so the listener drains cleanly on SIGTERM.
Call sites that should now record (deferred — not in this commit):
- engine.predict() → recordPredictionEmitted + startEngineCycleTimer
- prediction-resolver → recordPredictionResolved
- telegram pending-poller → recordTelegramDelivery (ok/suppressed)
- new data sources (Deribit / Hyperliquid / Reddit / 4chan /
cross-venue-funding) → recordDataSourceFetch
Recording at call sites is intentionally split off so the
infrastructure ships first and the recorders can be added per-module
in follow-up commits without churning the whole graph.
…es (Round 3.3 B2)
The four Round 3.0 free-tier adapters (Deribit options, Hyperliquid
positioning, Reddit sentiment, 4chan /biz/) each caught their own
fetch errors and returned null. Fine for one-off failures but pessimal
for two classes:
- Transient: upstream blips for 1-3s, retry would succeed; we miss
the data.
- Persistent: upstream rate-limits us for 10+ minutes; we keep
hitting it on every cycle, burning request budget.
New src/utils/fetch-resilience.ts owns:
- resilientFetch(url, source, opts) — fetch with exponential-backoff
retry on 5xx + network + timeout errors (default 2 retries: 250ms
+ 600ms). 4xx does not retry (caller bug).
- Per-source circuit breaker — after 5 consecutive failures the
breaker opens and short-circuits all calls for 60s. Half-open
probe lets one request through after cool-down to test recovery.
- Metrics integration — every call emits
vizzor_data_source_fetch_total{source,result} with result ∈ {ok,
4xx, 5xx, timeout, network_error, breaker_open, *_retry}.
- resilientJsonFetch(url, source, opts) — convenience that returns
null on any failure (matches the existing adapter pattern).
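The per-source breaker behaviour described above can be sketched as a small state machine. The class shape, method names, and injected clock are assumptions made so the example is testable without timers; the 5-failure threshold and 60s cool-down are the defaults named in the commit. Retry and metrics wiring are left out.

```typescript
type BreakerState = "closed" | "open" | "half_open";

class CircuitBreaker {
  private failures = 0;
  private openedAt: number | undefined = undefined;
  private probeInFlight = false;
  private readonly threshold: number;
  private readonly coolDownMs: number;
  private readonly now: () => number;

  constructor(threshold = 5, coolDownMs = 60000, now: () => number = () => Date.now()) {
    this.threshold = threshold;
    this.coolDownMs = coolDownMs;
    this.now = now;
  }

  state(): BreakerState {
    if (this.openedAt === undefined) return "closed";
    return this.now() - this.openedAt >= this.coolDownMs ? "half_open" : "open";
  }

  // Closed: always allow. Open: short-circuit. Half-open: allow a single probe.
  allowRequest(): boolean {
    const current = this.state();
    if (current === "closed") return true;
    if (current === "open") return false;
    if (this.probeInFlight) return false;
    this.probeInFlight = true;
    return true;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = undefined;
    this.probeInFlight = false;
  }

  // Opens after `threshold` consecutive failures; a failed probe re-opens immediately.
  recordFailure(): void {
    this.probeInFlight = false;
    this.failures++;
    if (this.openedAt !== undefined || this.failures >= this.threshold) {
      this.openedAt = this.now();
    }
  }
}
```

A caller would keep one breaker per source label (for example in a Map keyed by 'deribit_index'), check allowRequest() before fetching, and report the outcome afterwards.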
Applied to:
- deribit-options.ts → sources 'deribit_index', 'deribit_book'
- hyperliquid-positions.ts → source 'hyperliquid'
- retail-sentiment.ts → sources 'reddit:cryptocurrency',
'reddit:cryptomoonshots', '4chan:biz'
The remaining adapters (cross-venue funding Bybit/OKX, eth-gas,
btc-mempool) are intentionally not migrated in this commit — they
already have stable upstreams (Bybit/OKX/etherscan/mempool.space)
and changing them in the same commit increases blast radius without
matching reward. They can adopt the same wrapper in a follow-up.
Zero behavior change when upstreams are healthy. Under degraded
conditions, the operator now sees:
- Successful retries instead of intermittent gaps
- The breaker_open counter spiking when an upstream stays down
- Clean recovery via the half-open probe when it comes back
…Round 3.3 B3 + B4)
Two adjacent production-readiness fixes in docker-compose.yml.
B3 — pin floating :latest image tags:
- postgres: timescale/timescaledb:latest-pg15 → 2.21.4-pg15
`latest-pg15` was silently shipping PG15 patch releases AND
TimescaleDB minor bumps on each compose pull, risking
unannounced behavior changes (TimescaleDB 2.18 already removed
deprecated APIs). Pinned to a verified tag; bumps now require
intentional staging validation + a snapshot backup.
- n8n: n8nio/n8n:latest → 1.117.4
Same reason. n8n minor releases have broken workflows multiple
times via undocumented API shape changes.
B4 — n8n healthcheck:
Container previously reported "up" even when the internal n8n
process crashed; depends_on couldn't tell. Added probe against
the built-in /healthz endpoint that n8n exposes by default.
Bumps from here on require either:
- A docker compose pull + smoke test in staging, OR
- A documented rollback path on the prod machine
Pre-prod operators upgrading from main: run `docker compose pull
postgres n8n` AFTER taking a backup. Postgres minor-version
upgrades within the same major are safe; TimescaleDB minor-version
upgrades are usually safe but consult the upstream release notes
before applying in prod.
…ation (Round 3.4)
Operator bug 2026-05-25: after restarting prod and running /reset all,
old predictions and alerts kept firing as DMs. Root cause: every
Vizzor instance on the same machine wrote to the SAME files under
~/.vizzor/ — SQLite DB, config.yaml, wallets, ML state, leader lock,
pending DM queue.
When the operator ran dev (in one terminal) and prod (in another)
side-by-side, the two instances thrashed each other:
- Dev writes a prediction → it lands in the shared chronovisor_predictions
- Prod delivers the DM for it (whichever process holds leader-lock)
- Operator runs /reset all on prod → wipes dev's data too
- Dev keeps running → re-emits new predictions → operator thinks the
reset failed because DMs from the "old" set keep arriving (they're
actually NEW emissions, but indistinguishable to the operator)
- Both bots fight for the same leader-lock; whichever loses runs in
viewer mode and goes dark on its UI surface
- Pending DM queue interleaves between the two instances
- Alert rules from one instance trigger DMs in the other
Fix: getConfigDir() now honors VIZZOR_DATA_DIR env var when set.
- Default: ~/.vizzor/ (no migration for existing operators)
- Prod recommended: VIZZOR_DATA_DIR=/Users/<user>/.vizzor-prod
Both instances now have fully independent SQLite DBs, configs,
wallets, ML state, leader locks, alert rules, pending DM queues. No
shared mutable state.
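A minimal sketch of the directory resolution, with the environment and home directory injected so it stays self-contained. `resolveConfigDir` is a hypothetical stand-in for getConfigDir(), and treating an empty value as unset is an assumption.

```typescript
// Honors VIZZOR_DATA_DIR when set; otherwise falls back to <home>/.vizzor.
function resolveConfigDir(env: Record<string, string | undefined>, homeDir: string): string {
  const override = env["VIZZOR_DATA_DIR"];
  if (override !== undefined && override.trim() !== "") return override;
  return `${homeDir}/.vizzor`;
}
```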
NOT a complete fix on its own: instances sharing the same
TELEGRAM_BOT_TOKEN still compete for the inbound update stream
(Telegram only delivers each /predict etc. to whoever calls
getUpdates first). For full dev/prod isolation create a separate
Telegram bot via @BotFather for prod and put its token in the prod
config.yaml under VIZZOR_DATA_DIR.
No schema migration. No behavior change for operators who don't set
VIZZOR_DATA_DIR. 1630 tests remain green.
…ompt (Round 3.5)
After the Round 3.0 signal expansion shipped (Deribit options, Hyperliquid
positioning, funding z-score, retail sentiment NLP, multi-TF CVD, volume
profile POC), the operator's chat free-text predictions were still using
only the classic TA + on-chain stack (RSI, MACD, OBV, VWAP, taker, F&G).
The new signals were COMPUTED on every chronovisor call but never named
in the narrative output — the operator had no way to know whether they
fired or contributed.
This commit teaches the chat AI to surface them explicitly. Three edits:
§3.2 Signal Consistency Check — 6 new rows added to the bullish/bearish
classification table covering fundingZScore, optionsTermInversion +
otmSkew, hyperliquidFundingRate, cvdMultiTfDivergence, volumeProfile
POC, and retail euphoria/capitulation flags.
§3.4b Round 3.0 Institutional + Retail Signals — new section. Three
mandatory surfacing rules:
(a) Single-horizon predictions must name each non-null Round 3.0
field in the Signal Breakdown section and fold its bias into the
direction call.
(b) Multi-horizon / hour-by-hour breakdowns must reference Round 3.0
state PER BUCKET — Round 3.0 signals carry equal weight to TA.
(c) Null/warming-up fields are surfaced explicitly ("Round 3.0
[field]: warming up (need N more samples)" or "n/a for this
asset (Deribit limitation)").
Includes a 6-glyph visual marker system (institutional / smart money /
retail contrarian / POC magnet / funding window / gas-mempool) and a
concrete example bucket showing how to format the output.
§3.7 Contrarian Indicators — 10 new rows for retailEuphoriaFlag,
retailCapitulationFlag, fundingZScore extremes, options put/call skew
extremes, dexCex funding divergence, and multi-TF CVD divergence —
each tagged with its glyph so the chat output is scannable.
Zero engine changes. The signals were already computed; this commit
unlocks their narrative visibility. Operator can now run "predict SOL
every 30min today" and see Round 3.0 fields per bucket instead of
just classic TA.
Release v0.15.1 "Polaris"
Patch release on top of v0.15.0 "Orion". 11 commits covering the
WR diagnostics layer, per-user accuracy correctness, and the
reset / ghost-DM hygiene cluster surfaced during live operator
testing on 2026-05-21/22.
The release builds toward the operator's standing target of 70%+
tracked decisive WR by raising the emission bar, hard-skipping
unsafe-cycle calls, accelerating warmup exits, and giving the
operator runtime visibility into which signals are dragging WR
down.
A — Polaris diagnostics layer
A complete per-signal WR observability + correction loop. Every
signal now has a measured decisive WR, the engine can invert or
dampen a signal whose sign is flipped, and the operator can pin
overrides at runtime.
- signal_outcomes SQLite table: per-prediction per-signal vote +
outcome rows written by the resolver. Source of truth for all
per-signal accuracy queries.
- … signal (INVERT / DAMPEN / WATCH / KEEP) over configurable window +
sample threshold.
- … overrides applied before composite assembly. Supports both auto
(auditor-driven) and manual (operator-pinned) modes. Persisted to
signal_polarity.
- /wr-diagnose Telegram command: per-signal WR breakdown with
inversion / dampen / watch sections + active overrides.
- /signal-override Telegram command: operator-pinned polarity that
wins over the auto auditor.
- pnpm wr:audit CLI: exits non-zero when any active signal sits below
the INVERT threshold so CI can gate releases on per-signal
calibration health.
B — Per-user accuracy correctness
Closes a long-standing visibility bug where every operator on the
same Telegram bot saw a shared WR aggregated across all users.
- user_id column is now persisted at logPrediction insert time (the
read paths already filtered correctly; the bug was that the write
path never populated the column).
- engine.predict() gains an options.userId that flows through to every
logPrediction call site (forecast, directional with TP/SL, forecast
resolve).
- … from an authenticated user thread userId through: /predict,
/diagnose, /polymarket, get_prediction, predict_price,
polymarket_edge, bulk-ladder.
- … logging path.
C — Predictiveness signals (leading-indicator alpha)
Engine now anticipates moves via leading flow indicators instead
of reacting to lagging structure. Operator standing rule:
predictions must anticipate big moves, not chase them.
- leadingFlow signal aggregator: taker momentum, orderbook imbalance,
whale flow, CVD slope, funding inflection. Injected with 30% weight
at the front of the on-chain composite via
sigCfg.onChain.leadingFlowWeight.
- … CVD's recent slope is bearish (demand is being eaten), neutralize
the structural vote instead of letting it outvote real-time
order-flow truth.
- … failed_support_break, failed_resistance_rejection,
asian_range_continuation_bull/bear, funding_peak_reversal_bearish,
funding_trough_reversal_bullish, sweep_and_reverse_bullish_trap,
sweep_and_reverse_bearish_trap.
- … stable_supply_expanding_bullish /
stable_supply_contracting_bearish,
alt_rotation_confirmed_alt_bullish,
btc_rotation_dominant_btc_bullish (TOTAL2/TOTAL3 dominance algebra).
- … vol_term_inverted_break_imminent,
vol_acceleration_post_consolidation (realized vol + term structure).
- … realized σ + acceleration, pure compute on existing OHLCV.
- MarketContext gains spotPrice, stablecoinNetExpansionUsd,
altDominancePct, nonBtcEthDominancePct, realizedVol1h/4h,
volTermSlope, volAcceleration.
D — Band-edge deformation (post-forecast)
Two post-processors run after computePredictionForecast closes,
mutating a shallow copy of the forecast so the band absorbs
predictable wick patterns the prior grader counted as MISS.
- applyLiquidationMagnet: when a significant long / short liquidation
cluster sits beyond the band edge on the matching side, stretch the
edge to (cluster ± small buffer) capped at 2× the original
half-width. Symbol-agnostic: runs off liquidationProximity data the
engine already gathers.
- applyMomentumBandWidth: when taker momentum + OB imbalance
derivative + CVD short slope stack one way with magnitude ≥ 0.25,
widen the band on that side by up to 1.8× while modestly widening
the other edge.
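A rough sketch of the liquidation-magnet stretch on a simplified band. Several things here are assumptions: the function name `stretchTowardCluster`, the 0.1% buffer, reading the cap as "the edge may sit at most 2× the original half-width from the band midpoint", and picking the side purely from where the cluster sits relative to the band.

```typescript
interface Band {
  lower: number;
  upper: number;
}

// Stretches the nearer edge toward a liquidation cluster that sits outside the band.
function stretchTowardCluster(
  band: Band,
  clusterPrice: number,
  bufferPct = 0.001, // assumed "small buffer"
  maxStretch = 2,    // cap, in multiples of the original half-width
): Band {
  const mid = (band.lower + band.upper) / 2;
  const halfWidth = (band.upper - band.lower) / 2;
  const stretched: Band = { ...band }; // shallow copy, the input is not mutated
  if (clusterPrice > band.upper) {
    const target = clusterPrice * (1 + bufferPct);
    stretched.upper = Math.min(target, mid + maxStretch * halfWidth);
  } else if (clusterPrice < band.lower) {
    const target = clusterPrice * (1 - bufferPct);
    stretched.lower = Math.max(target, mid - maxStretch * halfWidth);
  }
  return stretched;
}
```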
E — Warmup acceleration (α + β + γ)
Three independent accelerators that reduce the real-world sample
budget needed to exit warmup-conservatism and engage full-strength
prediction.
- … 1000 most-recent resolved predictions through the calibrator +
rule tracker on boot when the persistence layer comes back empty.
Fresh DB or post-/reset start exits warmup in seconds instead of
30+ live resolutions. Idempotent.
- … cliff with sqrt(samples / minSamples) scaling on the probability
and confidence caps so confidence climbs smoothly through warmup
instead of unlocking on a single lucky resolution.
- … calibrator now accepts a number ∈ [0,1] in addition to a boolean
wasCorrect, feeding partial-credit resolution scores directly to the
SGD update. Empirically halves the sample count needed to converge
on the same MAE.
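The sqrt(samples / minSamples) ramp can be sketched as below. The scale itself follows the formula in the text; how it is applied to the caps is an assumption (a linear blend between a conservative warmup cap and the full cap), and both function names are hypothetical.

```typescript
// 0 at zero samples, 1 once samples >= minSamples, square-root ramp in between.
function warmupScale(samples: number, minSamples: number): number {
  if (minSamples <= 0) return 1;
  return Math.min(1, Math.sqrt(Math.max(0, samples) / minSamples));
}

// Assumed application: blend from the conservative cap toward the full cap.
function scaledCap(samples: number, minSamples: number, warmupCap: number, fullCap: number): number {
  return warmupCap + (fullCap - warmupCap) * warmupScale(samples, minSamples);
}
```

With minSamples = 100, a quarter of the samples already unlocks half of the headroom (sqrt(0.25) = 0.5), which is what makes the ramp smooth rather than a cliff.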
F — Quality gates (raise the floor)
- chronovisor.perFamilyMinLogConfidence (micro=0.45, intraday=0.40,
swing=0.35, macro=0.30). Short horizons (where prod WR sat at
15–32%) get a higher bar.
- … around 0.5: uncertain calls now correctly file as RANGE instead of
being emitted as razor-thin directional predictions.
- wrWatchdog config drives a rolling-WR warning DM in the resolver:
when recent decisive WR falls below warnBelow over windowSize
samples, the watchdog emits an info notification naming the
worst-performing signal.
- … with cycleHealth.status ∈ {degraded, unsafe} AND
systemConfirmations < 3 are now hard-skipped at qualification time
so they never enter the resolution pipeline. Cold-start (warmingUp)
is exempt.
- … "🚦 Gate: 0/3 — SKIP" label was misleading for RANGE plays
(sideways forecasts have no direction to confirm). RANGE plays now
render "🚦 Gate: RANGE play (3-system gate applies to directional
only)".
G — Reset hygiene + ghost-DM elimination
Operator bug report 2026-05-22: /reset all and manual sqlite DELETE
runs were leaving stale state that re-emerged as ghost predictions in
the next cycle. Three causes, fixed in one atomic operation.
- performFullReset(): atomic wipe of every prediction-related table
including Polaris additions (signal_outcomes, signal_polarity,
calibration_family, calibration_symbol, rule_stats, volatility,
weights, patterns), every alert rule, every pending notification.
- ChronoVisorEngine.getResetCallbacks(): exposes named clear()
functions for every in-process registry the engine owns
(CalibratorRegistry, RuleTracker, VolatilityRegistry,
SignalPolarityRegistry, pattern library, warmup-state cached
counts). Without flushing these, the next debounced ml-state save
wrote the pre-wipe snapshot straight back to DB.
- … sendTelegramDm: before delivering a prediction_resolved or
price_threshold DM the channel checks that the underlying
prediction / alert-rule row still exists. If not, the DM is
suppressed. Eliminates the post-wipe ghost bubble. Resolver writes
predictionId into metadata so the guard has a key.
- /reset all reply gains a visible "VIZZOR STATS RESET" divider block
so chat scrollback has an unambiguous "everything above is stale"
marker (Telegram permanently keeps history).
- CALIBRATION_RETRAIN_INTERVAL now also triggers signal-polarity
reconciliation from the auditor so a freshly-seen sign error is
corrected within ~20 resolutions of becoming statistically
defensible. Manual overrides are preserved.
H — /predict card renderer
Operator-requested compact trader-channel format. One card per
horizon, MarkdownV2-escaped, with branded glyphs for known assets and
a clean fallback for everything else.
Multi-horizon replies separate cards with a horizontal rule and
surface a "Multi-horizon consensus" chip when per-horizon directions
disagree.
symbol-display registry holds the branded glyphs with operator
extension via chronovisor.symbolDisplay.<TICKER> in config. Unknown
symbols fall back to a generic icon + raw ticker; features still
apply universally (no symbol-list gating).
Impact
- … operator's history. Pre-fix, every operator on the same bot saw a
shared aggregate.
- … unsafe cycles no longer enter the resolution pipeline. The raised
per-family floors + widened dead zone push genuinely uncertain calls
to file as RANGE.
- … graduated SGD label cut the practical sample budget by roughly
half on synthetic + replay tests.
- … orphan-row guard plus the in-memory registry flush.
- … /wr-diagnose, /signal-override, the rolling-WR watchdog DM, and
the polarity reconciliation DM give the operator a closed-loop view
of which signals are helping vs hurting.
Breaking changes
None. All new config fields ship with sane defaults; existing
configs without the new keys behave identically to v0.15.0 except
for the higher minLogConfidence floor (0.25 → 0.30) and the wider
sideways dead zone (0.495/0.505 → 0.46/0.54), both of which are
documented as intentional WR-quality changes.
Test plan
- pnpm typecheck: 0 errors
- pnpm lint: 0 errors (334 pre-existing non-null-assertion warnings)
- pnpm test: 133 files, 1630 tests, all green
- … /wr and /wr-diagnose polled hourly
- /reset all end-to-end: confirm no ghost DMs land after the wipe
completes
- … /predict BTC 30m from two different operator accounts and confirm
/wr shows distinct per-user histories
Round 2 — 2026-05-23 (additional commits)
Follow-up sprint after operator MISS data from 2026-05-22/23 showed
that BTC tracked WR was still sitting at ~37% with RANGE forecasts
landing 0.01–0.17% outside the displayed band and SMC self-
contradictions (bearish bias + supply zone, yet bullish vote) leaking
into directional emissions. Round 2 closes those specific failure
modes and adds the leading-edge signals + calibration acceleration
that didn't make Round 1.
Quick wins (ω₁–ω₆)
hard-skipped at qualification time (was advisory).
when
smc.vote disagrees with both smcLastBreakType and smcBias.
Negative-EV trades never reach chat regardless of profile.
/precisions and /wr render "— (n/m, warming up)"
until a calibrator has at least 10 decisive resolutions. Stops
the bot from publishing 14.5% / 22% noise WR figures.
volTermSlope, volAcceleration)
feed band-magnitude lift in composeMomentumDirection (direction
untouched — vol is a magnitude indicator only).
/predict card gains a "🔬 Why" section per emission:
per-signal CF contributions, triple-system status, guardrails
snapshot, short reasoning paragraph. On by default.
Calibration acceleration (δ ε θ)
calibration parameters via
inheritFamilyPriorIntoShard() instead
of starting cold. New tokens skip the ~48 h warmup window.
addition to per-family and per-(symbol, family). Each resolution
routes to all four buckets so a bull-trained curve no longer
poisons bear emissions.
updatePlatt SGD step is scaled by |p − 0.5| × 2 + 0.5.
High-confidence misses correct the calibrator harder than
low-confidence ones.
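The confidence-weighted step can be sketched as below. Only the scale formula `|p − 0.5| × 2 + 0.5` comes from the description above; the `PlattParams` shape, the log-loss gradient, and `baseLr` are assumptions for illustration and may not match the real `updatePlatt`.

```typescript
// Scale formula quoted in the PR text: |p − 0.5| × 2 + 0.5.
// Ranges from 0.5 (p = 0.5) up to 1.5 (p = 0 or p = 1).
function confidenceStepScale(p: number): number {
  if (!(p >= 0 && p <= 1)) throw new RangeError(`probability out of range: ${p}`);
  return Math.abs(p - 0.5) * 2 + 0.5;
}

// Hypothetical Platt parameters: calibrated = sigmoid(a * logit(raw) + b).
interface PlattParams {
  a: number;
  b: number;
}

const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));

// One log-loss SGD step with the learning rate multiplied by the confidence scale.
// outcome is 1 for a hit, 0 for a miss; baseLr is an assumed hyperparameter.
function plattSgdStep(
  params: PlattParams,
  rawProbability: number,
  outcome: 0 | 1,
  baseLr = 0.05,
): PlattParams {
  const clamped = Math.min(Math.max(rawProbability, 1e-6), 1 - 1e-6);
  const logit = Math.log(clamped / (1 - clamped));
  const error = sigmoid(params.a * logit + params.b) - outcome; // dLoss/dz
  const lr = baseLr * confidenceStepScale(rawProbability);
  return { a: params.a - lr * error * logit, b: params.b - lr * error };
}
```

With this scaling, a miss at p = 0.9 moves the intercept further than a miss at p = 0.6, both because the error term is larger and because the step is scaled by 1.3 instead of 0.7.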
New free-tier signals (H K L N π σ τ)
public funding-rate endpoints, 30 s cache, exposes weighted
average + binanceDivergencePct + maxDivergencePct. Drives
three new FOL rules around single-venue over-leverage.
baseline + 3× spike threshold. A 3×-baseline gas spike inside
5 min historically precedes ETH liquidation cascades by 30–90 min.
2× isCongested threshold. Fee acceleration is a 30–90 min
leading volatility signal.
Control (POC), 70%-volume value area, insideValueArea flag.
Pure compute on existing klines. Drives
volume_profile_above/below_va_mean_revert_* rules.
fetch, classifies 5m-bullish / 1h-bearish (and mirror) as
multi-TF divergence — smart-money distribution vs retail pump signal.
08:00 / 16:00 UTC funding settlements emit pre-payment squeeze
and post-payment relief / capitulation calls.
alignment (London / NY) combined with active leading flow
emits session-continuation rules.
Rationale
The Round 2 surface targets the exact MISS pattern in the operator's
2026-05-22/23 screenshots: dead-session emissions, SMC self-
contradictions reaching chat, R:R<1 trades labeled "Tentative",
WR figures published from <10-sample shards, and missing leading-
edge signals (cross-venue funding, gas / mempool, multi-TF flow)
that would have flagged the BTC RANGE breaks before they happened.
Verification
- pnpm typecheck — 0 errors
- pnpm lint — 0 errors (345 pre-existing non-null-assertion warnings)
- pnpm test — 133 files, 1630 tests, all green
Round 2 commits
- 6b7da0f — fix(ml): hard-skip dead-session + R:R<1 emissions (ω₁ ω₃)
- eb4ef8f — feat(telegram): min-sample-gated WR + full-context /predict (ω₄ ω₆)
- dbb27b1 — feat(ml): realized-vol band sizing + multi-TF CVD divergence (ω₅ π)
- 2187247 — feat(ml): regime + cross-symbol + confidence-weighted calibration (ε δ θ)
- 9d61dfb — feat(data): cross-venue funding + ETH gas + BTC mempool sources (H K L)
- 4be9c7e — feat(ml): volume profile + FOL rules + engine wiring for Round 2 (ω₂ + N + H I J K L M N π σ τ)
Round 2.5 — 2026-05-24 (additional commits)
After Round 2 shipped, live operator data showed BTC tracked WR
sitting at ~32% on n=10 decisive resolutions. The number itself is
inside the 95% CI of a coin flip at that sample size, but examining
the emission stream surfaced five concrete gaps Round 2 did not
close — all of which let weak-edge predictions reach chat regardless
of how good the calibrator gets.
Gap 1 — emission-floor bump (config)
perFamilyMinLogConfidence floors raised:
micro 0.45→0.58 · intraday 0.40→0.55 · swing 0.35→0.52 · macro 0.30→0.50.
A 55%-probability prediction cannot resolve at 70% even with perfect
calibration — the math forbids it. Volume drops ~40%, surviving
emissions resolve with meaningfully better EV. Floors lower back to
Round 2 values once decisive samples per family cross 60 and
calibration is trusted across all 3 regime shards.
Gap 2 — adversarial signal-conflict detector (ι)
New src/core/chronovisor/signal-conflict.ts scans the
onChain × logicRules pair and flags conflict when both |CF| ≥ 0.6 AND
their signs oppose. Engine hook fires before the existing
systemConfirmations / targetEqualsEntry normalize-to-sideways block:
on conflict the prediction is reframed as RANGE, probability is
collapsed toward 0.5–0.6, the forecast is recomputed as sideways,
and a signal_conflict reasoning line is threaded into the trigger
snapshot. Catches the exact SMC-bearish-vs-ICT-bullish-vs-RANGE-call
pattern in the operator's old MISS screenshots.
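A minimal sketch of the conflict condition described above (both |CF| ≥ 0.6 and opposing signs). The function name, parameter names, and return shape are hypothetical; only the threshold and the `signal_conflict` reason string come from the description.

```typescript
interface ConflictCheck {
  conflict: boolean;
  reason?: string;
}

// onChainCf and logicRulesCf are assumed to be signed certainty factors in [-1, 1],
// positive for bullish and negative for bearish.
function detectSignalConflict(
  onChainCf: number,
  logicRulesCf: number,
  minAbsCf = 0.6,
): ConflictCheck {
  const bothStrong =
    Math.abs(onChainCf) >= minAbsCf && Math.abs(logicRulesCf) >= minAbsCf;
  // A negative product means one side is bullish and the other bearish.
  const signsOppose = onChainCf * logicRulesCf < 0;
  if (bothStrong && signsOppose) {
    return { conflict: true, reason: "signal_conflict" };
  }
  return { conflict: false };
}
```

A strong bearish on-chain read against a strong bullish rules read (for example -0.7 vs +0.65) trips the detector; a strong read against a weak opposing one (-0.7 vs +0.3) does not.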
Gap 3 — composite EV gate
ω₃ catches R:R<1 and the per-family floor catches low-confidence calls
but a 56% / R:R 1.05 combo escapes both axes individually with
near-zero edge. Added prediction-qualification.ts rule:
skip if (probability − 0.5) × edgeRatio < 0.04 for directional
predictions with R:R ≥ 1.0. Knocks out the marginal positive-RR but
low-edge tail that was hitting the calibrator with coin-flip outcomes.
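The gate can be sketched as below. The description does not define `edgeRatio`, so this sketch takes it as an input rather than deriving it from R:R; the interface and function names are hypothetical.

```typescript
interface DirectionalCandidate {
  probability: number; // probability of the called direction, 0..1
  riskReward: number; // R:R of the trade plan
  edgeRatio: number; // taken as given; its definition is not stated in the PR text
}

// Returns true when the candidate should be skipped by the composite EV gate.
function failsCompositeEvGate(
  candidate: DirectionalCandidate,
  minCompositeEdge = 0.04,
): boolean {
  // R:R < 1 is left to the separate ω₃ hard-skip; this gate only covers R:R >= 1.0.
  if (candidate.riskReward < 1.0) return false;
  return (candidate.probability - 0.5) * candidate.edgeRatio < minCompositeEdge;
}
```

Multiplying the two axes means a candidate that barely clears each individual floor can still be rejected, which is the case the individual gates miss.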
Gap 4 — regime-aware emission throttle
Round 2's ε regime-bucketed calibration LEARNS per regime but didn't
THROTTLE EMISSIONS in a known-bad regime. Added:
- ConfidenceCalibrator.getHitRate() — cumulative bin-aggregate WR
- CalibratorRegistry.getRegimeShardHitRate(family, regime)
- shouldTrackPrediction accepts regimeShardHitRate + regimeShardSamples + regimeThrottleRoll. Throttle fires when samples ≥ 20 AND hitRate < 0.50 AND uniform random roll < 0.5 → 50% emission cut in known-bad regime.
- currentRegime ∈ {bull, bear, chop} → CalibrationRegime ∈ {bull, bear, consolidation}, queries the shard's hit rate, and passes Math.random() per call so the throttle is probabilistic (not a hard veto). Tests inject a deterministic roll for repeatability.
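A minimal sketch of the throttle condition and the regime mapping described above. Thresholds (20 samples, 0.50 hit rate, 0.5 roll) come from the text; the function names and the input shape are hypothetical.

```typescript
type CurrentRegime = "bull" | "bear" | "chop";
type CalibrationRegime = "bull" | "bear" | "consolidation";

// Map the engine regime onto the calibration shard key.
function toCalibrationRegime(regime: CurrentRegime): CalibrationRegime {
  return regime === "chop" ? "consolidation" : regime;
}

interface ThrottleInput {
  regimeShardHitRate: number; // cumulative hit rate of the shard, 0..1
  regimeShardSamples: number; // decisive resolutions in the shard
  regimeThrottleRoll: number; // uniform roll in [0, 1), injected so tests are deterministic
}

// True when this emission should be dropped by the regime throttle.
function shouldThrottleEmission(input: ThrottleInput): boolean {
  return (
    input.regimeShardSamples >= 20 &&
    input.regimeShardHitRate < 0.5 &&
    input.regimeThrottleRoll < 0.5
  );
}
```

In production the caller would pass `Math.random()` as the roll; a test passes a fixed value such as 0.3 or 0.7 to exercise both branches.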
Gap 6 — bootstrap calibrator from 30-day time window
Round 1's α bootstrap pulled the most recent 1000 resolved rows.
The count-based bound was wrong-shaped for low-volume operators
(may produce <1000 rows in 30+ days) and high-volume operators
(1000 rows may span only 8h and miss regime variety). Replaced
with a time-windowed pull: 30-day window by default
(VIZZOR_BOOTSTRAP_WINDOW_DAYS) capped at 5000 rows
(VIZZOR_BOOTSTRAP_MAX_ROWS). Zero behavior change for mid-volume
operators with <5000 rows in 30 days.
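The time-windowed pull can be illustrated in memory as below. The real implementation presumably runs as a database query; the `ResolvedRow` shape and function name are assumptions, and the defaults mirror the 30-day / 5000-row values above.

```typescript
interface ResolvedRow {
  id: number;
  resolvedAtMs: number; // resolution time, epoch milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Keep rows resolved inside the window, newest first, capped at maxRows.
function selectBootstrapRows(
  rows: ResolvedRow[],
  nowMs: number,
  windowDays = 30,
  maxRows = 5000,
): ResolvedRow[] {
  const cutoffMs = nowMs - windowDays * DAY_MS;
  return rows
    .filter((row) => row.resolvedAtMs >= cutoffMs && row.resolvedAtMs <= nowMs)
    .sort((a, b) => b.resolvedAtMs - a.resolvedAtMs)
    .slice(0, maxRows);
}
```

A low-volume operator gets every row from the last 30 days even if that is well under 1000; a high-volume operator gets up to 5000 rows spread over the window instead of only the most recent 1000.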
Round 2.5 commits
- 8183d89 — feat(config): raise per-family emission floors (Gap 1)
- d5c662b — feat(ml): bootstrap calibrator from 30-day time window (Gap 6)
- 93efddd — feat(ml): adversarial signal-conflict detector ι (Gap 2)
- 7bff8d5 — feat(ml): composite EV gate + regime-aware emission throttle (Gap 3 + Gap 4)
Verification
- pnpm typecheck — 0 errors
- pnpm lint — 0 errors (345 pre-existing non-null-assertion warnings)
- pnpm test — 133 files, 1630 tests, all green
Deferred to Round 3
Gap 5 (bring up the Python ML sidecar) is environmental, not a
code change — handled outside the PR. The combined expected WR lift
from Gaps 1–4 + Gap 6 once 20+ samples accumulate is roughly +10pp
to +15pp on the tracked-decisive cohort, primarily by reducing
low-edge emission volume rather than improving any single signal.
Round 2.6 + Round 3.0 — 2026-05-25 (additional commits)
After Round 2.5 shipped, the operator's WR snapshot (/precisions) showed
tracked decisive WR at 28% on n=71 — the qualification gate was selecting
WORSE predictions than the broader 32% cohort, because every Round 2.5
gate was directional-only and RANGE plays at 50% probability flooded
through. Round 2.6 closes the RANGE gap. Round 3.0 adds four free-tier
leading-indicator signals worth an estimated +7–12pp combined WR lift.
A small sweep at the end relocates two hardcoded symbol lists to config.
Round 2.6 — RANGE quality bar
The 45m BTC HIT card surfaced the bug:
RANGE 50% · 0/3 systems · dead session · Tier: Tracked.
A random-walk play counted toward WR. Round 2.6 adds three
sideways-specific gates inside shouldTrackPrediction:
information; only emit when the model has at least mild conviction.
signals; a single signal is noise.
< 0.5× ATR(14). Bands narrower than half a 1-bar range have ~50/50
"actual inside" by random walk; no edge.
Expected: ~50% drop in RANGE emission volume, surviving RANGE plays
carry real information, WR denominator stops being inflated by random
HITs.
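The band-width gate (skip RANGE calls whose band is narrower than 0.5× ATR(14)) can be sketched as below. The ATR here is a plain average of the last 14 true ranges, which is an assumption; the project may use Wilder smoothing or a different candle source, and the function names are hypothetical.

```typescript
interface Candle {
  high: number;
  low: number;
  close: number;
}

// Simple ATR: mean true range over the last `period` bars.
// Needs period + 1 candles because true range uses the previous close.
function averageTrueRange(candles: Candle[], period = 14): number {
  if (candles.length < period + 1) {
    throw new Error(`need at least ${period + 1} candles, got ${candles.length}`);
  }
  const recent = candles.slice(-(period + 1));
  let sum = 0;
  for (let i = 1; i < recent.length; i++) {
    const prevClose = recent[i - 1].close;
    const { high, low } = recent[i];
    sum += Math.max(high - low, Math.abs(high - prevClose), Math.abs(low - prevClose));
  }
  return sum / period;
}

// True when the RANGE band is too narrow to carry information.
function isRangeBandTooNarrow(
  lower: number,
  upper: number,
  atr14: number,
  minAtrMultiple = 0.5,
): boolean {
  return upper - lower < minAtrMultiple * atr14;
}
```

With an ATR of 2.0, a band of width 0.8 is rejected and a band of width 1.5 passes.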
Round 3.0 — four new free-tier signals (+10 FOL rules)
- funding_history fed by every cross-venue-funding fetch; computeFundingZScore exposes z-score per venue (Binance / Bybit / OKX, 30-day window, lazy-pruned 35d retention). Rules: funding_zscore_extreme_long_bearish (z > +2 → SHORT), funding_zscore_extreme_short_bullish (z < -2 → LONG)
- get_book_summary_by_currency + get_index_price, no auth. ATM IV per tenor (1-7d, 8-30d), term inversion flag, put/call OI ratio, 25-delta-proxy skew. Rules: options_iv_term_inverted_volatility_imminent, options_put_skew_extreme_bearish_reversal_imminent, options_call_skew_extreme_bullish_top, options_iv_crush_post_event_relief_bullish
- metaAndAssetCtxs, no auth. Per-asset funding rate + open interest. DEX-vs-CEX funding divergence flags smart-money leverage concentration. Rules: hyperliquid_funding_long_skew_bearish (HL funding >20% above Binance → SHORT), hyperliquid_funding_short_skew_bullish (HL >20% below → LONG)
- .json (UA header only); 4chan /biz/catalog.json. Lexicon-based sentiment + mention-spike z-score per ticker, 6h rolling window. Rules: retail_euphoria_top_warning_bearish (mention spike >3σ + sentiment > 0.7 → SHORT), retail_capitulation_bottom_warning_bullish (mirror)
All four sources are free, no Nansen / Arkham / Glassnode-paid per
feedback_whale_flow_edge.md. Each gracefully returns null/undefined
for assets the source doesn't cover, and the dependent FOL rules skip
cleanly via typeof guards — no false fires, no degradation for symbols
outside coverage.
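As an illustration of the funding z-score rules and the "skip cleanly via typeof guards" behavior, here is a minimal sketch. The rule names and the ±2 thresholds come from the list above; the function names, the population standard deviation, and the `minSamples` cutoff are assumptions and may differ from `computeFundingZScore`.

```typescript
// z-score of the latest funding rate against a window of past rates.
// Returns undefined when there is too little history or no variance.
function fundingZScore(
  history: number[],
  latest: number,
  minSamples = 10,
): number | undefined {
  if (history.length < minSamples) return undefined;
  const mean = history.reduce((sum, x) => sum + x, 0) / history.length;
  const variance =
    history.reduce((sum, x) => sum + (x - mean) * (x - mean), 0) / history.length;
  const std = Math.sqrt(variance);
  if (std === 0) return undefined;
  return (latest - mean) / std;
}

type FundingRule =
  | "funding_zscore_extreme_long_bearish"
  | "funding_zscore_extreme_short_bullish";

// The rule fires only when a numeric z-score is available.
function fundingZScoreRule(z: number | undefined): FundingRule | undefined {
  if (typeof z !== "number") return undefined; // source has no coverage: skip
  if (z > 2) return "funding_zscore_extreme_long_bearish";
  if (z < -2) return "funding_zscore_extreme_short_bullish";
  return undefined;
}
```

An asset with no funding history yields `undefined` from the first function, so the rule never fires for it.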
Sweep — hardcoded symbol lists → config
Two source-level hardcoded coin lists relocated to config-driven
defaults (per feedback_no_hardcoded_symbols.md):
- MAJOR_SYMBOLS (collector.ts, 23 tokens) → collector.symbols
- QUICK_SYMBOLS (telegram/ui-helpers.ts, 7 tokens) → telegram.quickSymbols
getCollectorDefaults() and getQuickSymbols() read the values via
getConfig() at call time with safe pre-config-load fallbacks so an
operator can edit YAML + reload without restart. Runtime Telegram
commands (/collector-symbols, /quick-symbols) deferred to Round 3.1
to complete the spirit of feedback_no_file_edits_for_operator_config.md.
Round 2.6 + 3.0 commits
- 9dafb16 fix(ml): range quality bar + min-band-width gate (Round 2.6)
- 8667f23 feat(data): funding-rate 30-day history + z-score (Round 3.0 H)
- ae7d615 feat(data): deribit options term structure + skew (Round 3.0 D)
- 3229cad feat(data): hyperliquid DEX positioning + funding divergence (Round 3.0 X)
- 1ac2fda feat(data): reddit + 4chan retail sentiment NLP (Round 3.0 R)
- 981f4f9 refactor(config): move MAJOR_SYMBOLS + QUICK_SYMBOLS to config-driven defaults
Verification
- pnpm typecheck — 0 errors
- pnpm lint — 0 errors (352 pre-existing warnings)
- pnpm test — 133 files, 1630 tests, all green
Round 3.1 + 3.2 — 2026-05-25 (post-restart hygiene)
Operator restart bug: after killing the bot, wiping
chronovisor_predictions and alert_rules (or only one of them), and
restarting, old DMs and old auto-armed alerts still fired. The Round 1
G3 orphan-row guard was only applied at notification ENQUEUE time, and
prediction-armed alert rules survived unless the operator explicitly
ran /reset alerts. Two fixes close the gap at every surface.
Round 3.1 — session-scoped DM delivery + orphan re-guard at poller
Problem: DMs sit in pending_notifications between enqueue and poller
drain. The G3 orphan guard fires only at enqueue. When the bot is killed
mid-flight and predictions are wiped before restart, the poller picks up
stale rows on the next session and delivers them referencing rows that no
longer exist.
Layer 1 — session expiry at startup. New
expirePendingDmsOlderThan(beforeMs) in pending-queue.ts marks delivered
every undelivered row with created_at < beforeMs. The poller captures
Date.now() at boot and runs the expiry once. Anything queued by a previous
session is discarded instead of spuriously delivered. Logs:
Expired N stale pending Telegram DM(s) from prior session(s).
Layer 2 — orphan re-guard at delivery. New metadata TEXT column on
pending_notifications (idempotent ALTER, null for legacy rows).
sendTelegramDm passes notification.metadata through. The poller parses
metadata before each send and re-checks chronovisor_predictions /
alert_rules existence — if the underlying row is gone the DM is marked
delivered without sending. Mid-session deletes are caught even when the
DM was already queued.
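The two layers can be illustrated in memory as below. The real code operates on the `pending_notifications` table; the `PendingDm` shape, the JSON metadata layout with a `predictionId` key, and the `isOrphaned` helper are assumptions made for this sketch.

```typescript
interface PendingDm {
  id: number;
  createdAtMs: number;
  delivered: boolean;
  metadata: string | null; // JSON text; null for legacy rows
}

// Layer 1: mark every undelivered row older than the boot timestamp as delivered.
// Returns how many rows were expired.
function expirePendingDmsOlderThan(queue: PendingDm[], beforeMs: number): number {
  let expired = 0;
  for (const dm of queue) {
    if (!dm.delivered && dm.createdAtMs < beforeMs) {
      dm.delivered = true;
      expired++;
    }
  }
  return expired;
}

// Layer 2: re-check at delivery time that the referenced prediction still exists.
// Assumes metadata looks like {"predictionId": "..."}.
function isOrphaned(
  dm: PendingDm,
  predictionExists: (predictionId: string) => boolean,
): boolean {
  if (dm.metadata === null) return false; // legacy row: nothing to check
  let parsed: unknown;
  try {
    parsed = JSON.parse(dm.metadata);
  } catch {
    return false; // unreadable metadata: do not suppress on a parse error
  }
  if (typeof parsed !== "object" || parsed === null) return false;
  const predictionId = (parsed as { predictionId?: unknown }).predictionId;
  return typeof predictionId === "string" && !predictionExists(predictionId);
}
```

A poller built on these would call `expirePendingDmsOlderThan(queue, Date.now())` once at boot, then skip (and mark delivered) any row for which `isOrphaned` returns true.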
Round 3.2 — prune orphaned prediction-armed alert rules at startup
Problem: Predictions arm price-threshold alert rules with id
pred_<predictionId>[_upper|_lower|_label]. Manual SQL wipes that only
delete chronovisor_predictions leave these rules dangling. On the next
session the price-alert-bridge keeps firing them against live prices and
the operator sees 42 system alerts on /alerts for predictions that no
longer exist.
Fix: New pruneOrphanedPredictionAlertRules() in notifications/store.ts
runs once at bot startup:
- SELECT id FROM alert_rules WHERE id LIKE 'pred_%'
- pred_ prefix and _upper / _lower / _tp1 / _sl suffix to extract the underlying predictionId
- chronovisor_predictions; collect IDs whose row is gone
Manual user alerts (no pred_ prefix) are left untouched. Logs:
Pruned N orphaned prediction-armed alert rule(s).
Round 3.1 + 3.2 commits
- 9846f56 fix(telegram): session-scoped DM delivery + orphan re-guard at poller (Round 3.1)
- ba52277 fix(telegram): prune orphaned prediction-armed alert rules at startup (Round 3.2)
- fdfd14b feat(telegram): refresh /start welcome + first-run nudge with chat-first framing
Verification
Manual operator test:
- /alerts after startup → should show 0 system alerts (was 42)
- Pruned N orphaned prediction-armed alert rule(s) and Expired N stale pending Telegram DM(s) lines
- /predict BTC 1h, kill bot mid-resolution, wipe predictions, restart → resolved-DM does NOT fire (orphan guard at delivery suppresses)
Round 3.3 — 2026-05-25 (production hardening: Sprint A + B)
After the production-deployment audit flagged 5 BLOCKERs + 12 HIGHs,
this sprint clears all 9 critical items so the branch is shippable.
Post-launch items (coverage thresholds, runbook, MEDIUM/LOW) are
intentionally deferred until the bot is in production where the
operator can observe and prioritize them with real data.
Sprint A — Pre-launch blockers
A1 — env-var validation fail-fast (config/loader.ts)
New validateProductionConfig() triggers when NODE_ENV=production
OR VIZZOR_REQUIRE_FULL_CONFIG=true. Throws at boot when
ANTHROPIC_API_KEY, TELEGRAM_BOT_TOKEN, or ML_API_SECRET
(when ml.enabled) is missing. Dev runs without keys still work.
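A minimal sketch of the fail-fast check described in A1. The env var names and trigger conditions come from the text; the signature (an explicit `env` map and an `mlEnabled` flag) and the error message are assumptions, chosen so the function is testable without touching `process.env`.

```typescript
type EnvMap = Record<string, string | undefined>;

// Throws when strict mode is on and a required key is missing or empty.
function validateProductionConfig(env: EnvMap, mlEnabled: boolean): void {
  const strict =
    env["NODE_ENV"] === "production" || env["VIZZOR_REQUIRE_FULL_CONFIG"] === "true";
  if (!strict) return; // dev runs without keys still work

  const required = ["ANTHROPIC_API_KEY", "TELEGRAM_BOT_TOKEN"];
  if (mlEnabled) required.push("ML_API_SECRET");

  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required production env vars: ${missing.join(", ")}`);
  }
}
```

Calling it once at boot, before any network client is constructed, turns a late runtime failure into an immediate startup error that names every missing key.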
A2 — Sentry error tracking (utils/sentry.ts)
New @sentry/node integration gated on SENTRY_DSN. Captures
unhandledRejection + uncaughtException before the existing stderr
log. PII scrub redacts Authorization-style headers. Graceful
flush on SIGTERM. No-op when DSN unset.
A3 — security patches (pnpm up)
Cleared 6 moderate + 1 high advisories (postcss XSS, brace-expansion
DoS, ws uninitialized-memory disclosure, ...). Remaining 2 moderate
floor available without dropping wallet-chain support.
A4 — hot-path DB indices
Added composite + partial indices on chronovisor_predictions
(symbol+created_at, resolved_at, user_id+symbol, source+resolved_at),
alert_rules (enabled+type), and a standalone funding_history
(fetched_at) so the lazy-prune query becomes a range scan instead
of a table scan once the table grows past ~100k rows.
A5 — /health auth (telegram/health-server.ts)
Optional bearer-token gate via VIZZOR_HEALTH_TOKEN. Constant-time
comparison. Returns 401 with WWW-Authenticate challenge. Open
endpoint preserved when env var unset (back-compat with
loopback-only setups).
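The gate in A5 can be sketched as below. To stay dependency-free this uses a hand-rolled comparison loop; in Node the usual primitive is `crypto.timingSafeEqual`, and the actual health server may well use that instead. The function names, result shape, and the `realm` value are assumptions.

```typescript
// Compare two strings without returning early at the first differing character.
function constantTimeEqual(a: string, b: string): boolean {
  const length = Math.max(a.length, b.length);
  let diff = a.length ^ b.length; // a length mismatch alone makes the result false
  for (let i = 0; i < length; i++) {
    // charCodeAt returns NaN past the end of the string; treat that as 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

interface HealthAuthResult {
  status: 200 | 401;
  headers: Record<string, string>;
}

// expectedToken stands in for the VIZZOR_HEALTH_TOKEN value.
function checkHealthAuth(
  authorizationHeader: string | undefined,
  expectedToken: string | undefined,
): HealthAuthResult {
  if (!expectedToken) return { status: 200, headers: {} }; // token unset: endpoint stays open
  const prefix = "Bearer ";
  const presented =
    authorizationHeader !== undefined && authorizationHeader.startsWith(prefix)
      ? authorizationHeader.slice(prefix.length)
      : "";
  if (constantTimeEqual(presented, expectedToken)) return { status: 200, headers: {} };
  return { status: 401, headers: { "WWW-Authenticate": 'Bearer realm="health"' } };
}
```

Keeping the check as a pure function of the header and the configured token makes both the open-endpoint and the 401 path easy to unit test.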
Sprint B — Pre-launch reliability
B1 — Prometheus metrics endpoint (utils/metrics.ts)
New /metrics endpoint gated on VIZZOR_METRICS_PORT. Exposes
default process metrics + four counters
(vizzor_predictions_emitted_total, vizzor_predictions_resolved_total,
vizzor_telegram_dm_delivered_total, vizzor_data_source_fetch_total)
+ a vizzor_engine_cycle_duration_seconds histogram. Recorders are
small wrappers so call sites never import prom-client directly.
B2 — retry / backoff / circuit breaker (utils/fetch-resilience.ts)
New resilientJsonFetch(url, source, opts) adds 2 exponential-backoff
retries on 5xx + timeouts, per-source circuit breaker (5
consecutive failures → 60s open with half-open probe on recovery),
and metrics on every call. Applied to Deribit options, Hyperliquid
positioning, and Reddit + 4chan retail-sentiment adapters. Remaining
stable upstreams (Bybit, OKX, Etherscan, mempool.space) deferred
to a follow-up — changing them in the same commit increased blast
radius without matching reward.
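The breaker behavior in B2 (5 consecutive failures → 60 s open → half-open probe) can be sketched as a small state machine. This is an illustration under assumptions, not the contents of `fetch-resilience.ts`: the class, its method names, the injectable clock, and the 250 ms backoff base are all hypothetical.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private consecutiveFailures = 0;
  private openedAtMs: number | null = null;
  private probeInFlight = false;
  private readonly failureThreshold: number;
  private readonly openMs: number;
  private readonly now: () => number;

  constructor(failureThreshold = 5, openMs = 60000, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.openMs = openMs;
    this.now = now;
  }

  state(): BreakerState {
    if (this.openedAtMs === null) return "closed";
    return this.now() - this.openedAtMs >= this.openMs ? "half-open" : "open";
  }

  // True when a request may be attempted right now.
  allowRequest(): boolean {
    const current = this.state();
    if (current === "closed") return true;
    if (current === "open") return false;
    if (this.probeInFlight) return false; // only one probe while half-open
    this.probeInFlight = true;
    return true;
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAtMs = null;
    this.probeInFlight = false;
  }

  recordFailure(): void {
    this.probeInFlight = false;
    this.consecutiveFailures++;
    if (this.consecutiveFailures >= this.failureThreshold) {
      this.openedAtMs = this.now(); // (re)open the circuit
    }
  }
}

// Delay before retry attempt n (1-based): baseMs, 2 × baseMs, 4 × baseMs, ...
function backoffDelayMs(attempt: number, baseMs = 250): number {
  return baseMs * Math.pow(2, attempt - 1);
}
```

One breaker instance per source name keeps a failing upstream from affecting calls to the others, and a failed half-open probe simply reopens the circuit for another 60 s.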
B3 — image version pins (docker-compose.yml)
postgres: timescale/timescaledb:latest-pg15 → 2.21.4-pg15,
n8n: latest → 1.117.4. Bumps now require staging validation +
backup; no more silent breaking changes via floating tags.
B4 — n8n healthcheck (docker-compose.yml)
Probe against n8n's built-in /healthz. Container no longer
reports "up" when the internal process is crashed.
Round 3.3 commits
- bc0ba42 feat(config): fail-fast on missing production env vars (A1)
- 28a3c0f feat(ml): sentry error tracking with graceful flush on shutdown (A2)
- d90502e chore(deps): patch security advisories via pnpm up (A3)
- 5313683 perf(data): hot-path indices on predictions / alert_rules / funding_history (A4)
- 9de0ae0 feat(telegram): optional bearer-token auth on /health endpoint (A5)
- 7a60c22 feat(ml): prometheus metrics endpoint with engine + delivery counters (B1)
- 1222052 feat(data): retry / backoff / circuit-breaker on Round 3.0 data sources (B2)
- 3d4f701 chore(deps): pin postgres + n8n image versions, add n8n healthcheck (B3 + B4)
Production env vars introduced
- NODE_ENV=production
- VIZZOR_REQUIRE_FULL_CONFIG=true
- SENTRY_DSN
- SENTRY_TRACES_SAMPLE_RATE
- VIZZOR_HEALTH_TOKEN: /health (A5)
- VIZZOR_METRICS_PORT: /metrics server (B1)
- VIZZOR_METRICS_HOST
Verification
- pnpm typecheck — 0 errors
- pnpm lint — 0 errors
- pnpm test — 133 files, 1630 tests, all green
- pnpm audit --prod — 2 moderate + 1 high (1 ignored), all deep
transitive via @solana/web3.js (upstream tracking issue)
Deferred to post-launch (operator observes + prioritizes)
(engine.predict, prediction-resolver, pending-poller, data
sources) — infrastructure ships first, then per-module recorders
resilientJsonFetch (stable upstreams; lower priority)
DEPLOYMENT.md
/reset notifications and /reset config commands