Language: English | 한국어
InferEdge is a local-first Edge AI inference validation portfolio. It connects build provenance, real runtime evidence, validation reports, optional deterministic diagnosis, and Lab-owned deployment decisions across separate repositories without turning the project into a production SaaS dashboard, cloud control plane, or generic monitoring stack.
The short version:
| Signal | Evidence |
|---|---|
| Deployability pipeline | Forge -> Runtime -> Lab -> optional AIGuard |
| Comparability layer | EdgeEnv local registry / comparability / runtime regression evidence |
| Operation layer | Orchestrator queue/deadline/fallback and worker-health evidence |
| Jetson TensorRT result | YOLOv8n TensorRT FP16: 10.066 ms mean, 15.548 ms p99, 99.34 FPS |
| CPU baseline | ONNX Runtime CPU: 45.430 ms mean, 49.213 ms p99, 22.01 FPS |
| Real device replay | Jetson Orin Nano ONNX replay: 155.86 ms mean, 45.5 C, 1000 MB RAM |
| Sustained operation smoke | 5-minute-class Jetson replay: 3600 frames, 50.375 C, 1038 MB RAM |
Clone the entrypoint and pinned Core 4 repositories:

    git clone https://github.com/gwonxhj/InferEdge.git
    cd InferEdge
    bash scripts/clone_all.sh --locked

Run the local portfolio smoke:

    bash scripts/smoke_all.sh

That smoke checks Forge, Runtime, Lab, AIGuard, and the local-first Runtime Intelligence artifact chain. It validates reviewer-facing report markers and contract boundaries. Boundary marker: Production observability platform or GitLab control plane is out of scope.
InferEdge separates three questions that are often mixed together in inference projects:
Can we deploy this model? -> InferEdge validation layer
Can this benchmark evidence be trusted and compared? -> InferEdgeEnv comparability layer
Can deployed workloads stay stable under load? -> InferEdgeOrchestrator operation layer
ONNX Model
-> InferEdgeForge
-> InferEdge-Runtime
-> InferEdgeLab
-> optional InferEdgeAIGuard
-> Deployment Decision Report
-> Local Studio
Runtime Operation / Intelligence evidence extends the pipeline without replacing it:
InferEdgeOrchestrator operation context
-> InferEdgeEnv registry / comparability / regression evidence
-> optional InferEdgeAIGuard deterministic evidence
-> InferEdgeLab Runtime Intelligence Risk Summary
-> Lab-owned deployment decision
| Repository | Role | URL |
|---|---|---|
| InferEdgeForge | Build provenance, metadata, manifest, artifact handoff | https://github.com/gwonxhj/InferEdgeForge |
| InferEdge-Runtime | C++ execution, Lab-compatible result.json, Jetson/runtime result evidence | https://github.com/gwonxhj/InferEdge-Runtime |
| InferEdgeLab | Compare/evaluate/report/API/Local Studio/deployment decision owner | https://github.com/gwonxhj/InferEdgeLab |
| InferEdgeAIGuard | Optional deterministic diagnosis evidence provider | https://github.com/gwonxhj/InferEdgeAIGuard |
| InferEdgeEnv | Local evidence registry, comparability checker, runtime regression owner | https://github.com/gwonxhj/InferEdgeEnv |
| InferEdgeOrchestrator | Runtime operation context provider for queue/deadline/fallback evidence | https://github.com/gwonxhj/InferEdgeOrchestrator |
| Evidence | Current record | Where to inspect |
|---|---|---|
| TensorRT Jetson FP16 | mean 10.066401 ms, p99 15.548438 ms, 99.340373 FPS | Local Studio demo evidence |
| ONNX Runtime CPU baseline | mean 45.4299 ms, p99 49.2128 ms, 22.0119 FPS | Local Studio demo evidence |
| TensorRT speedup | about 4.51x FPS over ONNX Runtime CPU | Local Studio demo evidence |
| YOLOv8 subset validation | 10 images, 89 boxes, simplified mAP@50 0.1410, precision 0.2941, recall 0.1685 | Lab evaluation evidence |
| Jetson device-local replay | 96 frames, 155.86 ms mean, 156.877 ms p95, max 45.5 C / 1000 MB RAM | Jetson Device-Local Agent Runtime Evidence Report (한국어: Jetson 디바이스 로컬 에이전트 런타임 증거 보고서) |
| Jetson 5-minute-class sustained replay | 3600 frames, Vision mean 152.77 ms, p95 156.948 ms, max 50.375 C / 1038 MB RAM | Jetson Device-Local 5-Minute Sustained Smoke Report (한국어: Jetson 디바이스 로컬 5분급 지속 스모크 보고서), HTML report (한국어: HTML 보고서) |
The Jetson records prove local evidence preservation and runtime-operation handoff. They do not claim decoded YOLO accuracy, live camera service, Whisper/FastAPI service execution, production remote execution, or thermal endurance validation.
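The headline numbers in the table above can be cross-checked with simple arithmetic. The sketch below is not part of the InferEdge repositories. It assumes FPS is derived as 1000 divided by the mean latency in milliseconds (an assumption the recorded values are consistent with, not a documented Lab formula), and the function names are hypothetical.

```python
# Recorded values copied from the evidence table above.
TRT_MEAN_MS = 10.066401   # TensorRT Jetson FP16 mean latency
TRT_FPS = 99.340373       # TensorRT Jetson FP16 recorded FPS
ORT_MEAN_MS = 45.4299     # ONNX Runtime CPU mean latency
ORT_FPS = 22.0119         # ONNX Runtime CPU recorded FPS


def fps_from_mean_latency(mean_ms: float) -> float:
    """Assumed relation: one inference at a time, so FPS = 1000 / mean latency (ms)."""
    return 1000.0 / mean_ms


def fps_speedup(candidate_fps: float, baseline_fps: float) -> float:
    """Throughput ratio of the candidate backend over the baseline."""
    return candidate_fps / baseline_fps


trt_fps_check = fps_from_mean_latency(TRT_MEAN_MS)
ort_fps_check = fps_from_mean_latency(ORT_MEAN_MS)
speedup = fps_speedup(TRT_FPS, ORT_FPS)

print(f"TensorRT FPS from mean latency: {trt_fps_check:.2f}")
print(f"ONNX Runtime CPU FPS from mean latency: {ort_fps_check:.2f}")
print(f"FPS speedup: {speedup:.2f}x")
```

Both recomputed FPS values agree with the recorded ones to two decimal places, and the ratio of the recorded FPS values gives the "about 4.51x" figure.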
| Area | Status | Reviewer signal |
|---|---|---|
| Core Forge -> Runtime -> Lab -> optional AIGuard validation pipeline | Implemented | Build provenance, Runtime result evidence, Lab compare/report/decision, optional deterministic AIGuard evidence |
| Local Studio demo evidence replay | Implemented | Local browser workflow for demo evidence, compare, deployment decision, and AIGuard cases |
| YOLOv8 COCO subset / model contract validation | Implemented | Subset evaluation plus bbox/score/contract validation |
| AIGuard diagnosis cases | Implemented | Deterministic bbox, score, baseline, temporal, and runtime-reliability warning evidence |
| Runtime Intelligence artifact gate | Implemented | Cross-repo smoke for the Orchestrator -> EdgeEnv -> AIGuard -> Lab bundle, including directly gated Jetson preservation and remote fallback Lab markers |
| Orchestrator producer-backed / device-local smoke | Smoke/Starter | Queue depth, drop/fallback, policy reason, Lab operation context, and EdgeEnv preservation evidence |
| Remote dispatch / fallback starter | Smoke/Starter | File-based worker selection, local HTTP fallback worker evidence, bounded fallback recovery, Lab-owned report context |
| Cloudflare / dashboard / production worker services | Future Work | Documented direction only |
The Runtime Intelligence artifact gate is a cross-repo smoke that keeps the
Orchestrator -> EdgeEnv -> AIGuard -> Lab artifact chain readable and
reproducible. The Lab's local-first Runtime Intelligence artifact preserves
remote-dispatch boundary rows, Runtime replay duration scope, and compact
queue/deadline/fallback operation markers without making CI a runtime control
plane.
The smoke checks:
| Gate area | Reviewer-facing marker |
|---|---|
| Duration traceability | Validated Duration Traceability, duration_handoff_alignment, duration_source, duration_scope_label, runtime_intelligence_ci_artifact_gate_summary.md |
| Replay scope | Runtime replay duration scope, short 96-frame-class replay (96 frames), scope_label=source=entrypoint_requested_frames, Duration Comparison Summary |
| Jetson/device-local preservation | Lab EdgeEnv preservation context, lab_report_preservation_context_present=True, lab_preservation=present, identity=jetson_device_local_preservation, path=device_local_starter |
| Operation pressure | Reviewer operation quick scan, compact queue/deadline/fallback operation markers, Orchestrator queue/deadline/fallback markers, Queue pressure reasons, queue_pressure_reason, queue_pressure_reason=queue_backlog_threshold_exceeded, max_total_queue_depth, max_total_queue_depth=7, fallback_count, deadline_missed_count |
| AIGuard traceability | aiguard_raw_context: max_total_queue_depth traceability preserved, lab_expected_report_markers, lab_report_contract_context, aiguard_validates_expected_report_markers=false |
| Remote fallback | Remote fallback starter evidence, lab=Remote fallback starter evidence, remote_execution_recovered_by_fallback |
For the generated artifact list and the split between operation-smoke and
Runtime Intelligence smoke gates, see
docs/agent_runtime_e2e_demo.md
(한국어: 에이전트 런타임 e2e 데모 문서).
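A marker gate of this kind can be pictured as a plain substring check over a generated report. The sketch below is illustrative only and is not how scripts/smoke_all.sh is implemented. The function names and the sample report text are hypothetical, while the marker strings are taken from the table above.

```python
from typing import Iterable


def missing_markers(report_text: str, markers: Iterable[str]) -> list[str]:
    """Return the markers that do not appear verbatim in the report text."""
    return [marker for marker in markers if marker not in report_text]


def parse_key_value_markers(report_text: str) -> dict[str, str]:
    """Collect whitespace-separated key=value tokens into a dict (last one wins)."""
    pairs: dict[str, str] = {}
    for token in report_text.split():
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = token.partition("=")
        if sep and key and value:
            pairs[key] = value.rstrip(",.")
    return pairs


# Hypothetical report excerpt, invented for this example.
sample_report = """Reviewer operation quick scan
queue_pressure_reason=queue_backlog_threshold_exceeded
max_total_queue_depth=7
Remote fallback starter evidence
"""

expected = [
    "Reviewer operation quick scan",
    "max_total_queue_depth=7",
    "remote_execution_recovered_by_fallback",
]

missing = missing_markers(sample_report, expected)
values = parse_key_value_markers(sample_report)

print("missing markers:", missing)
print("max_total_queue_depth:", values["max_total_queue_depth"])
```

In this invented excerpt the remote-fallback recovery marker is absent, so a gate built this way would report it as missing instead of passing silently.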
Run the Reliable Edge Agent Runtime extension smoke when the supporting Orchestrator repo is available in the same workspace:

    bash scripts/demo_agent_runtime_e2e.sh
    # Device-local starter path.
    bash scripts/demo_agent_runtime_e2e.sh --device-local
    # Preserve EdgeEnv local run evidence in the same bundle.
    bash scripts/demo_agent_runtime_e2e.sh --device-local --edgeenv-run-evidence
    # Remote dispatch starter evidence with bounded fallback.
    bash scripts/demo_agent_runtime_e2e.sh --remote-dispatch

For repeat Jetson sustained runs, start with the readiness preflight:

    bash scripts/check_jetson_sustained_readiness.sh
    bash scripts/demo_jetson_5min_sustained.sh --edgeenv-run-evidence

check_jetson_sustained_readiness.sh only checks SSH, tegrastats, repo
cleanliness, model availability, and EdgeEnv CLI availability. It does not
create new evidence. If the target Jetson is offline, keep using the committed
reports above instead of implying fresh Jetson runtime evidence.
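The preflight rule described above (check readiness, create no evidence, fall back to committed reports when the device is unavailable) can be summarized as a small decision function. This is a hedged sketch, not the contents of check_jetson_sustained_readiness.sh. The check names, the `ReadinessResult` type, and `evaluate_readiness` are hypothetical and only mirror the five checks listed in the text.

```python
from dataclasses import dataclass

# Hypothetical check names mirroring what the preflight is described as covering.
PREFLIGHT_CHECKS = ("ssh", "tegrastats", "repo_clean", "model_available", "edgeenv_cli")


@dataclass(frozen=True)
class ReadinessResult:
    ready: bool
    failed_checks: tuple[str, ...]
    next_step: str


def evaluate_readiness(check_results: dict[str, bool]) -> ReadinessResult:
    """Combine per-check booleans; a check that was never reported counts as failed."""
    failed = tuple(
        name for name in PREFLIGHT_CHECKS if not check_results.get(name, False)
    )
    if failed:
        return ReadinessResult(
            ready=False,
            failed_checks=failed,
            next_step="use committed reports; do not claim fresh Jetson evidence",
        )
    return ReadinessResult(
        ready=True,
        failed_checks=(),
        next_step="run the sustained smoke to collect new evidence",
    )


# Example: the Jetson is unreachable, so SSH and tegrastats both fail.
offline = evaluate_readiness(
    {"ssh": False, "tegrastats": False, "repo_clean": True,
     "model_available": True, "edgeenv_cli": True}
)
online = evaluate_readiness({name: True for name in PREFLIGHT_CHECKS})

print(offline.ready, offline.failed_checks)
print(online.ready, online.next_step)
```

The function only reads check results and returns a recommendation, which matches the point that a readiness preflight never produces evidence on its own.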
For the clean replay procedure, see
Clean Jetson Replay Runbook
(한국어: 클린 Jetson 재현 런북).
Detailed ownership tables live in InferEdge Ecosystem 1-Page Summary (한국어: InferEdge 생태계 1페이지 요약) and Pipeline Map (한국어: 파이프라인 맵). The compact README boundary is:
| Project | Canonical owner role | Evidence it owns | Must not be treated as |
|---|---|---|---|
| InferEdgeForge | build provenance / handoff owner | metadata.json, manifest.json, source/artifact identity, build summary | Runtime executor, scheduler, deployment decision owner |
| InferEdge-Runtime | execution / result evidence owner | Lab-compatible result.json, latency/FPS/backend/device context, runtime health and telemetry seeds | Artifact builder, registry, anomaly detector, scheduler, deployment decision owner |
| InferEdgeLab | validation report / deployment decision owner | compare/evaluate output, Markdown/HTML reports, Local Studio, deployment_decision | Build system, registry, deterministic diagnosis owner, scheduler, production dashboard |
| InferEdgeAIGuard | optional deterministic diagnosis evidence provider | guard_analysis, warning/review evidence, raw-context traceability | Final deployment decision owner, LLM root-cause engine, production monitor |
| InferEdgeEnv | local evidence registry / comparability / runtime regression owner | run registry, replay bundle, comparability judgement, regression report | Production DB, cloud telemetry store, deployment decision owner, general monitoring SaaS |
| InferEdgeOrchestrator | runtime operation context provider | queue/deadline/fallback evidence, worker health, remote-dispatch starter evidence | Kubernetes replacement, cloud orchestration platform, deployability decision owner, completed production scheduler |
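The ownership boundary in the table above can be expressed as data and checked mechanically. The sketch below is an illustration under stated assumptions, not code from any InferEdge repository: the role strings are copied from the "Canonical owner role" column, and `deployment_decision_owners` is a hypothetical helper.

```python
# Canonical owner roles, copied from the boundary table above.
OWNER_ROLES = {
    "InferEdgeForge": "build provenance / handoff owner",
    "InferEdge-Runtime": "execution / result evidence owner",
    "InferEdgeLab": "validation report / deployment decision owner",
    "InferEdgeAIGuard": "optional deterministic diagnosis evidence provider",
    "InferEdgeEnv": "local evidence registry / comparability / runtime regression owner",
    "InferEdgeOrchestrator": "runtime operation context provider",
}


def deployment_decision_owners(roles: dict[str, str]) -> list[str]:
    """Return every project whose canonical role mentions the deployment decision."""
    return sorted(
        project for project, role in roles.items() if "deployment decision" in role
    )


owners = deployment_decision_owners(OWNER_ROLES)
print("deployment decision owners:", owners)

# The boundary holds only if exactly one project owns the decision.
single_owner = len(owners) == 1
print("single decision owner:", single_owner)
```

A check like this would flag any future role description that quietly gives a second project deployment decision ownership.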
| Need | Document |
|---|---|
| Ecosystem diagram and layer split | InferEdge Ecosystem 1-Page Summary (한국어: InferEdge 생태계 1페이지 요약) |
| 30-second portfolio narrative | Portfolio Summary (한국어: 포트폴리오 요약) |
| Repository responsibilities and contract boundaries | Pipeline Map (한국어: 파이프라인 맵) |
| Agent Runtime / Runtime Operation smoke details | docs/agent_runtime_e2e_demo.md (한국어: 에이전트 런타임 e2e 데모 문서) |
| Interview-ready explanation | Interview Narrative (한국어: 인터뷰 내러티브) |
| Current Jetson device-local evidence | Jetson Device-Local Agent Runtime Evidence Report (한국어: Jetson 디바이스 로컬 에이전트 런타임 증거 보고서) |
| Current Jetson 5-minute-class evidence | Jetson Device-Local 5-Minute Sustained Smoke Report (한국어: Jetson 디바이스 로컬 5분급 지속 스모크 보고서), HTML report (한국어: HTML 보고서) |
Use this path when reviewing the ecosystem in Korean without losing the Validation -> Evidence -> Operation Control boundary.
| Step | Lifecycle question | Quick guide |
|---|---|---|
| 1 | How was the artifact built? | Forge agent manifest contract |
| 2 | How did Runtime record execution evidence? | Runtime agent result contract |
| 3 | Who owns the deployment decision? | Lab Korean README |
| 4 | What deterministic diagnosis evidence exists? | AIGuard detector validation matrix |
| 5 | Can benchmark evidence be trusted and compared? | EdgeEnv runtime regression monitor |
| 6 | Can deployed workloads stay stable under load? | Orchestrator operation control guide |
This review path does not change ownership: Lab remains the final deployment decision owner, EdgeEnv owns comparability/regression evidence, AIGuard owns deterministic diagnosis evidence, and Orchestrator owns runtime operation context.
| File | Purpose |
|---|---|
| repos.lock | Pinned Core 4 clone snapshot for Forge, Runtime, Lab, and AIGuard |
| repos.yaml | Supporting ecosystem references such as Orchestrator starter evidence |
| scripts/clone_all.sh | Clone pinned repositories into repos/ |
| scripts/update_all.sh | Pull all cloned repositories |
| scripts/smoke_all.sh | Run cross-repo portfolio smoke checks |
| scripts/demo_agent_runtime_e2e.sh | Generate local Agent Runtime evidence bundles |
| scripts/check_jetson_sustained_readiness.sh | Check Jetson readiness before repeat sustained evidence collection |
| scripts/demo_jetson_5min_sustained.sh | Convenience runner for repeat 5-minute-class Jetson sustained smoke |
InferEdge is a validation and runtime-operation evidence workflow, not a production SaaS dashboard, production observability platform, Kubernetes-style orchestration system, general monitoring SaaS, AI OS, or cloud control plane. The final deployment decision owner remains InferEdgeLab. AIGuard provides deterministic warning/diagnosis evidence, EdgeEnv owns local registry and comparability evidence, and Orchestrator owns bounded operation context rather than a completed production scheduler.