
Project 3 eBPF Bot

Huzefa Husain edited this page Jan 1, 2026 · 4 revisions

Project 3 – eBPF Coverage Bot

The eBPF Coverage Bot simulates an observability feedback loop: it inspects coverage, decides what to probe next, processes the resulting signal, and stores the result, while emitting traces, metrics, and logs through OpenTelemetry.

Architecture

flowchart LR
  A[CoverageBot] -->|coverage map| B[CoverageOrchestrator]
  B -->|next probe| C[SignalProcessor]
  C -->|processed| D[SignalStore]
  D -->|store/load| E[UnifiedPipeline]
  E -->|returns next probe| B
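The loop in the diagram can be sketched in plain Python. This is a minimal illustration only: the class names come from the diagram, but every method name (`coverage_map`, `next_probe`, `process`, `save`, `load`, `run`) and the probe names are assumptions, not the project's actual API.

```python
class CoverageBot:
    """Tracks which probes already have coverage (hypothetical interface)."""

    def __init__(self, probes):
        self._covered = {name: False for name in probes}

    def coverage_map(self):
        return dict(self._covered)

    def mark_covered(self, probe):
        self._covered[probe] = True


class CoverageOrchestrator:
    """Chooses the first uncovered probe, or None when nothing is left."""

    def next_probe(self, coverage):
        return next((name for name, done in coverage.items() if not done), None)


class SignalProcessor:
    def process(self, probe):
        return {"probe": probe, "status": "processed"}


class SignalStore:
    def __init__(self):
        self._signals = []

    def save(self, signal):
        self._signals.append(signal)

    def load(self):
        return list(self._signals)


class UnifiedPipeline:
    """Drives the loop: coverage map -> next probe -> process -> store."""

    def __init__(self, probes):
        self.bot = CoverageBot(probes)
        self.orchestrator = CoverageOrchestrator()
        self.processor = SignalProcessor()
        self.store = SignalStore()

    def run(self):
        while True:
            probe = self.orchestrator.next_probe(self.bot.coverage_map())
            if probe is None:  # full coverage reached, stop the loop
                break
            self.store.save(self.processor.process(probe))
            self.bot.mark_covered(probe)
        return self.store.load()


results = UnifiedPipeline(["probe_a", "probe_b"]).run()
```

Each pass hands the current coverage map to the orchestrator, so the loop stops once every probe is marked as covered.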

Telemetry pipeline:

flowchart LR
  App[ebpf-bot Python] -->|OTLP gRPC 4317| OTel[OTel Collector]
  OTel -->|Traces| Jaeger[Jaeger]
  OTel -->|Metrics| Prometheus[Prometheus]
  OTel -->|Logs| File[otel-logs/logs.json]
  Prometheus --> Grafana[Grafana]

Scope Completed

1) OpenTelemetry instrumentation

  • Spans across core modules: CoverageBot, CoverageOrchestrator, UnifiedPipeline, ProbeInjector, SignalReceiver, SignalProcessor, SignalStore.
  • Span attributes/events added to capture coverage details and decision metadata.
  • Unit and E2E tests validate span names, attributes, and hierarchy.
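As a rough illustration of what a hierarchy check might look like, the sketch below validates parent/child relationships on plain span records. `SpanRecord`, `parent_name`, the span names, and the `coverage.ratio` attribute are all hypothetical. In a real test the records would more likely come from the OpenTelemetry SDK, for example from `InMemorySpanExporter.get_finished_spans()`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SpanRecord:
    """Minimal view of an exported span (hypothetical test fixture type)."""
    name: str
    span_id: str
    parent_id: Optional[str] = None
    attributes: dict = field(default_factory=dict)


def parent_name(spans, child_name):
    """Return the parent span's name for child_name, or None for a root span."""
    by_id = {s.span_id: s for s in spans}
    child = next(s for s in spans if s.name == child_name)
    parent = by_id.get(child.parent_id)
    return parent.name if parent else None


# Illustrative span tree; names and IDs are made up for this example.
spans = [
    SpanRecord("UnifiedPipeline.run", "a1"),
    SpanRecord("CoverageOrchestrator.decide", "b2", parent_id="a1",
               attributes={"coverage.ratio": 0.5}),
    SpanRecord("SignalProcessor.process", "c3", parent_id="b2"),
]
```

A test can then assert, for instance, that `parent_name(spans, "SignalProcessor.process")` equals `"CoverageOrchestrator.decide"`.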

2) Metrics and logs

  • Metrics emitted from UnifiedPipeline:
    • ebpf_bot_decisions_total (counter)
    • ebpf_bot_decision_latency_ms (histogram)
    • ebpf_bot_errors_total (counter)
  • Logs are correlated with trace/span IDs.
  • Demo scripts: scripts/emit_trace.py, scripts/emit_telemetry.py.
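One way to correlate logs with traces is to stamp each log record with the active trace and span IDs. The sketch below uses only the standard library and random IDs. `TraceContextFilter` and the logger name are hypothetical, and in the real code the IDs would presumably come from the current OpenTelemetry span context rather than `secrets`.

```python
import io
import logging
import secrets


class TraceContextFilter(logging.Filter):
    """Attach trace_id and span_id to every record passing through the logger."""

    def __init__(self, trace_id, span_id):
        super().__init__()
        self.trace_id = trace_id
        self.span_id = span_id

    def filter(self, record):
        record.trace_id = self.trace_id
        record.span_id = self.span_id
        return True  # never drop the record, only enrich it


# W3C Trace Context sizes: 16-byte trace ID and 8-byte span ID, as lowercase hex.
trace_id = secrets.token_hex(16)
span_id = secrets.token_hex(8)

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"))

logger = logging.getLogger("ebpf_bot.demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)
logger.addFilter(TraceContextFilter(trace_id, span_id))

logger.info("decision made")
line = stream.getvalue().strip()
```

With both IDs present in every line, a log entry can be matched to its trace in Jaeger by searching for the trace ID.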

3) OpenTelemetry Collector

  • otel-collector-config.yaml defines traces, metrics, and logs pipelines.
  • Exporters: OTLP to Jaeger (traces), Prometheus (metrics), debug and file (logs).
  • Logs persisted to otel-logs/logs.json.

4) Observability stack

  • Docker Compose services: jaeger, otel-collector, prometheus, grafana.
  • Grafana provisioned with Prometheus datasource and dashboard.

5) Testing and validation

  • Metrics test: tests/test_metrics_pipeline.py (validated).
  • Tracing tests across components and end-to-end.
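A metrics check can be as simple as parsing the Prometheus text exposition and looking for the expected series. The sketch below is an assumption about how such a check might work, not the contents of tests/test_metrics_pipeline.py. The `SAMPLE` text is illustrative, and the series names the collector actually exposes may differ, for example through added namespaces or unit suffixes.

```python
def parse_samples(text):
    """Parse Prometheus text exposition into {series: value}."""
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            end = line.rindex("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            series, *rest = line.split()
        samples[series] = float(rest[0])  # rest[1], if present, is a timestamp
    return samples


def has_metric(samples, name):
    """True if any series has exactly this metric name, with or without labels."""
    return any(s == name or s.startswith(name + "{") for s in samples)


# Illustrative exposition text, not captured from the real collector.
SAMPLE = """
# TYPE ebpf_bot_decisions_total counter
ebpf_bot_decisions_total{job="ebpf-bot"} 3
# TYPE ebpf_bot_decision_latency_ms histogram
ebpf_bot_decision_latency_ms_bucket{job="ebpf-bot",le="+Inf"} 3
ebpf_bot_decision_latency_ms_sum{job="ebpf-bot"} 12.5
ebpf_bot_decision_latency_ms_count{job="ebpf-bot"} 3
# TYPE ebpf_bot_errors_total counter
ebpf_bot_errors_total{job="ebpf-bot"} 0
"""

samples = parse_samples(SAMPLE)
```

Histograms appear only as `_bucket`, `_sum`, and `_count` series, so a check for the latency metric should look for one of those suffixes rather than the bare name.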

Quickstart

From projects/ebpf-bot:

docker compose down && docker compose up -d
python3 scripts/emit_trace.py

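Endpoints:

  • Grafana: http://localhost:3000 (admin/admin)
  • Collector OTLP: localhost:4317 (gRPC), localhost:4318 (HTTP)
  • Collector Prometheus exporter: localhost:8889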

Deployment Notes

Docker-only

docker compose down && docker compose up -d
python3 scripts/emit_trace.py

Local venv + Docker stack

python3 -m venv .venv
source .venv/bin/activate
pip install -e .
docker compose down && docker compose up -d
python3 scripts/emit_trace.py

Production-ish

  • Use a reachable OTLP endpoint (e.g., https://otel-collector.yourdomain:4317).
  • Enable TLS and supply CA certs where required.
  • Configure auth (mTLS or token) at the collector ingress.
  • Tune batching/timeouts for production traffic.
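The settings above could be supplied through the standard OpenTelemetry environment variables. The sketch below builds such an environment. `otlp_env` is a hypothetical helper, the batch values are placeholders rather than recommendations, and whether the demo scripts honor these variables depends on how they configure their exporters.

```python
from urllib.parse import quote, urlparse


def otlp_env(endpoint, ca_file="", token=""):
    """Build OTEL_* variables for a TLS-protected collector endpoint."""
    parsed = urlparse(endpoint)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError(f"expected an https:// endpoint, got {endpoint!r}")
    env = {
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
        # Batch span processor tuning (placeholder values).
        "OTEL_BSP_SCHEDULE_DELAY": "2000",        # milliseconds between exports
        "OTEL_BSP_MAX_EXPORT_BATCH_SIZE": "256",  # spans per export request
    }
    if ca_file:
        # Path to the CA bundle used to verify the collector's certificate.
        env["OTEL_EXPORTER_OTLP_CERTIFICATE"] = ca_file
    if token:
        # Header values are percent-encoded, so the space becomes %20.
        env["OTEL_EXPORTER_OTLP_HEADERS"] = "authorization=" + quote(f"Bearer {token}")
    return env


env = otlp_env("https://otel-collector.yourdomain:4317", token="example-token")
```

The resulting dictionary can be merged into `os.environ` before launching a script, which keeps endpoints and credentials out of the code.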

Troubleshooting

  • StatusCode.UNAVAILABLE: the collector is not reachable; confirm the containers are running and ports 4317/4318 are exposed.
  • No metrics on :8889: confirm the Prometheus exporter is enabled and that the collector container was restarted afterward.
  • Missing otel-logs/logs.json: ensure the otel-logs/ directory exists, then restart the collector.
  • Grafana 401s: log in at http://localhost:3000 (admin/admin).
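A quick reachability check can narrow down most of the issues above before digging into container logs. The sketch below assumes the stack runs on localhost with the ports mentioned on this page. `port_open` and `check_stack` are hypothetical helpers, not part of the project.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or unresolvable host
        return False


def check_stack(host="localhost"):
    """Probe the ports referenced on this page and report which ones answer."""
    ports = {
        "otlp-grpc": 4317,
        "otlp-http": 4318,
        "prometheus-exporter": 8889,
        "grafana": 3000,
    }
    return {name: port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, ok in check_stack().items():
        print(f"{name}: {'reachable' if ok else 'unreachable'}")
```

An open port only shows that something is listening; it does not confirm that the collector pipelines are configured correctly.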

Project 3 Verdict

✅ Complete, validated, and ready for integration.
