Complete observability setup for production deployments using AbsurderSQL with --features telemetry.
The monitoring stack provides:
- 4 Grafana Dashboards with 28 panels for real-time visibility
- 18 Alert Rules with severity-based routing
- 26 Recording Rules for pre-aggregated metrics
- Complete Runbooks for every alert type
- Browser DevTools Extension for WASM telemetry debugging
Build AbsurderSQL with telemetry support:
# For native applications
cargo build --features telemetry
# For WASM/browser
wasm-pack build --target web --features telemetry

Add a /metrics endpoint to your application. See examples in the main README for axum and actix-web.
Add this job to your prometheus.yml:
scrape_configs:
  - job_name: 'absurdersql'
    static_configs:
      - targets: ['localhost:9090']  # host:port where your application serves /metrics
    scrape_interval: 15s

Import the JSON files from grafana/ into your Grafana instance:
- query_performance.json - Query metrics and slow query detection
- storage_operations.json - Block I/O and cache performance
- system_health.json - Error rates and system status
- multi_tab_coordination.json - Multi-tab sync debugging
Add prometheus/alert_rules.yml to your Prometheus configuration:
rule_files:
  - /path/to/absurdersql/monitoring/prometheus/alert_rules.yml

For WASM/browser deployments:
- Open Chrome: chrome://extensions/
- Enable "Developer mode"
- Click "Load unpacked"
- Select the ../browser-extension/ directory
See ../browser-extension/INSTALLATION.md for complete instructions.
Monitors SQL query execution and identifies performance bottlenecks:
- Query latency percentiles (p50, p90, p99)
- Query rate and error rate
- Slow query detection (>100ms)
- Query type breakdown (SELECT/INSERT/UPDATE/DELETE)
Tracks block storage performance and cache efficiency:
- Block read/write rates
- Cache hit ratio and effectiveness
- Storage layer latency
- Block allocation metrics
Overall system health and error tracking:
- Total error rate by type
- Transaction success rate
- Active connections
- Resource utilization
Debugging multi-tab synchronization:
- Leader election status
- Write queue depth
- Broadcast channel activity
- Sync operation latency
- HighErrorRate - >5% errors over 5 minutes
- ExtremeQueryLatency - p99 query time >1 second
- NoQueryThroughput - Zero queries for 5 minutes (potential deadlock)
- StorageFailures - >3 storage failures per minute
- ElevatedErrorRate - >2% errors over 5 minutes
- SlowQueryDetected - Queries consistently >100ms
- LowCacheHitRate - Cache hit rate <60%
- MultiTabSyncDelayed - Sync operations >500ms
- LeaderElected - New tab became leader
- FirstQueryExecuted - Database initialization
See RUNBOOK.md for detailed remediation procedures for each alert.
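As a rough illustration of how thresholds like these map onto Prometheus alerting rules, the sketch below expresses two of the alerts against recording rules named in this README. It is not copied from alert_rules.yml. The group name, the severity label values, the for durations, and the assumption that the ratios are fractions between 0 and 1 (rather than percentages) are all illustrative, so check the shipped file for the actual expressions.

```yaml
groups:
  - name: absurdersql_example_alerts      # hypothetical group name
    rules:
      - alert: HighErrorRate
        # Assumes absurdersql:error_ratio is a fraction in [0, 1].
        # If it is recorded as a percentage, compare against 5 instead.
        expr: absurdersql:error_ratio > 0.05
        for: 5m                            # assumed: condition must hold for 5 minutes
        labels:
          severity: critical               # assumed label used for routing
        annotations:
          summary: "AbsurderSQL error ratio above 5%"

      - alert: LowCacheHitRate
        # Same assumption: the hit ratio is a fraction in [0, 1].
        expr: absurdersql:cache_hit_ratio < 0.60
        for: 10m                           # assumed duration
        labels:
          severity: warning
        annotations:
          summary: "AbsurderSQL cache hit ratio below 60%"
```

A file in this shape can be validated before loading with promtool check rules followed by the file path.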
Pre-aggregated metrics for faster dashboard queries:
- absurdersql:query_rate - Queries per second
- absurdersql:error_rate - Errors per second
- absurdersql:error_ratio - Error percentage
- absurdersql:query_latency_p99 - 99th percentile latency
- absurdersql:cache_hit_ratio - Cache effectiveness
- And 21 more...
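For readers unfamiliar with recording rules, here is a minimal sketch of how three of the names above could be defined. The raw counter names (absurdersql_queries_total, absurdersql_errors_total), the 5m rate window, and the group name are hypothetical placeholders, not the metrics AbsurderSQL actually exports; the real definitions live in prometheus/alert_rules.yml.

```yaml
groups:
  - name: absurdersql_example_recording   # hypothetical group name
    interval: 15s                         # evaluate at the scrape interval used above
    rules:
      # Hypothetical raw counters; substitute the names exposed on /metrics.
      - record: absurdersql:query_rate
        expr: sum(rate(absurdersql_queries_total[5m]))

      - record: absurdersql:error_rate
        expr: sum(rate(absurdersql_errors_total[5m]))

      # Rules in a group are evaluated in order, so this one can
      # build on the two series recorded just above it.
      - record: absurdersql:error_ratio
        expr: absurdersql:error_rate / absurdersql:query_rate
```

Dashboards and alerts can then query the short recorded series instead of recomputing rate() over raw counters on every refresh.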
For debugging WASM telemetry in the browser:
Features:
- Real-time span list with search/filtering
- Export statistics visualization
- Buffer inspection
- Manual flush trigger
- OTLP endpoint configuration
Installation: See ../browser-extension/INSTALLATION.md
          Application Code
                 ↓
           [AbsurderSQL]
                 ↓
Observability Layer (--features telemetry)
                 ↓
         ┌───────┴───────┐
         ↓               ↓
    Prometheus     OpenTelemetry
         ↓               ↓
      Grafana     DevTools Extension
         ↓
      Alerts
         ↓
 Runbook Procedures
monitoring/
├── README.md                          # This file
├── RUNBOOK.md                         # Alert remediation procedures
├── grafana/                           # Grafana dashboards
│   ├── query_performance.json         # 7 panels
│   ├── storage_operations.json        # 6 panels
│   ├── system_health.json             # 8 panels
│   └── multi_tab_coordination.json    # 7 panels
└── prometheus/                        # Prometheus configuration
    └── alert_rules.yml                # 18 alerts + 26 recording rules
- Test Setup - Use ../examples/devtools_demo.html to generate test telemetry
- Configure Alertmanager - Set up alert routing and notification channels
- Tune Thresholds - Adjust alert thresholds based on your workload
- Monitor Dashboard - Regularly review dashboards for anomalies
- Follow Runbooks - Use RUNBOOK.md when alerts fire
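For the Alertmanager step, severity-based routing might look roughly like the sketch below. It assumes the alert rules attach a severity label with values such as critical and warning. The receiver names and webhook URLs are placeholders, and the matchers syntax requires a reasonably recent Alertmanager (0.22 or later).

```yaml
route:
  receiver: default                 # fallback for alerts that match no child route
  group_by: ['alertname']
  routes:
    - matchers:
        - severity="critical"       # assumed label value
      receiver: oncall
    - matchers:
        - severity="warning"        # assumed label value
      receiver: team-chat

receivers:
  - name: default                   # no integrations: alerts are accepted but not sent
  - name: oncall
    webhook_configs:
      - url: 'http://localhost:5001/critical'   # placeholder endpoint
  - name: team-chat
    webhook_configs:
      - url: 'http://localhost:5001/warning'    # placeholder endpoint
```

Running amtool check-config on the file catches syntax and reference errors before Alertmanager is restarted.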
- Verify --features telemetry was enabled during build
- Check /metrics endpoint is accessible
- Confirm Prometheus is scraping (check Prometheus UI targets page)
- Verify Prometheus datasource is configured in Grafana
- Check time range selector (default: last 1 hour)
- Ensure application is generating queries
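If the datasource is missing, one way to add it is Grafana's file-based provisioning. The sketch below is a generic example placed under Grafana's provisioning/datasources/ directory. The datasource name and URL are assumptions, and the bundled dashboards may expect a different datasource name, so adjust to match what the JSON files reference.

```yaml
apiVersion: 1

datasources:
  - name: Prometheus                # assumed name; match what the dashboards expect
    type: prometheus
    access: proxy                   # Grafana's backend proxies the queries
    url: http://prometheus:9090     # placeholder: address of your Prometheus server
    isDefault: true
```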
- Check Prometheus Rules page to see if rules are loaded
- Verify alert expressions evaluate to true (test in Prometheus UI)
- Confirm Alertmanager is configured
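On the Prometheus side, alerts only reach Alertmanager if prometheus.yml points at it. A minimal sketch, assuming Alertmanager runs locally on its default port 9093:

```yaml
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']   # assumed Alertmanager address
```

Without this block, rules can still show as firing in the Prometheus UI while no notification is ever sent.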
- Verify extension is loaded (check chrome://extensions/)
- Open browser console and look for [Content], [DevTools], [Panel] logs
- Ensure demo page is using window.postMessage to send telemetry
For issues or questions:
- Check RUNBOOK.md for common problems
- Review dashboard panels for debugging clues
- Check Prometheus logs for scraping errors
- Inspect DevTools extension console for message flow