fix(external): don't list AIMA's own deployment backend (llama.cpp :8080) as an external service #83

Open
rjckkkkk wants to merge 2 commits into develop from feat/external-skip-own-deploy


@rjckkkkk rjckkkkk commented Jun 9, 2026

Collaborator

Problem

The external-service scanner probes a fixed port list (internal/external/service.go: 8000-8010, 8080, 7860, 5000, 5001, 3000). 8080 is llama.cpp's default port, which an AIMA native deployment also binds. So a model AIMA deployed itself gets surfaced again under "外部服务 / external services" as importable.

Observed on a box where AIMA deployed Qwen3.6-27B-Q4_K_M:

deploy.list   : Qwen3.6-27B-Q4_K_M  runtime=native  address=127.0.0.1:8080  running
external.list : base_url=http://127.0.0.1:8080  source=scan  imported=false  models=["Qwen3.6-27B-Q4_K_M.gguf"]

Same backend, listed in both panels. Worse than cosmetic: clicking import would register a self-referential external-openai backend pointing at AIMA's own deployment.

Fix

Reconciler already holds the proxy, which knows its own (non-external) deployment backends. Exclude any scanned/persisted service whose host:port matches one:

  • normalizeHostPort — reduce a base URL/address to a comparable host:port, folding localhost / ::1 / 0.0.0.0 to 127.0.0.1.
  • Reconciler.ownDeploymentAddrs — the host:port set of proxy backends where External == false.
  • Scan skips own-deployment addresses (no upsert); List filters them out, so a self entry persisted by an earlier scan also disappears.

Genuine external services (e.g. Ollama on :11434, LM Studio, a manually-started llama.cpp on a different port) are unaffected.
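
A minimal sketch of the matching logic described above, not the PR's actual code. The `backend` struct, `filterExternal`, and the choice to return an empty string for port-less inputs are assumptions made for illustration; only the names `normalizeHostPort` and `ownDeploymentAddrs` and the loopback folding rule come from the description.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strings"
)

// normalizeHostPort reduces a base URL or bare address to "host:port",
// folding localhost / ::1 / 0.0.0.0 to 127.0.0.1.
// Assumption: inputs without an explicit port yield "".
func normalizeHostPort(raw string) string {
	s := strings.TrimSpace(raw)
	if s == "" {
		return ""
	}
	// url.Parse only fills in Host when a scheme is present,
	// so bare addresses like "127.0.0.1:8080" get a dummy one.
	if !strings.Contains(s, "://") {
		s = "http://" + s
	}
	u, err := url.Parse(s)
	if err != nil {
		return ""
	}
	host, port := u.Hostname(), u.Port()
	if port == "" {
		return ""
	}
	switch strings.ToLower(host) {
	case "localhost", "::1", "0.0.0.0":
		host = "127.0.0.1"
	}
	return net.JoinHostPort(host, port)
}

// backend is a hypothetical stand-in for a proxy backend entry.
type backend struct {
	Address  string
	External bool
}

// ownDeploymentAddrs collects the host:port set of non-external backends.
func ownDeploymentAddrs(backends []backend) map[string]struct{} {
	own := make(map[string]struct{})
	for _, b := range backends {
		if b.External {
			continue
		}
		if hp := normalizeHostPort(b.Address); hp != "" {
			own[hp] = struct{}{}
		}
	}
	return own
}

// filterExternal drops base URLs that point at an own deployment.
func filterExternal(baseURLs []string, own map[string]struct{}) []string {
	var kept []string
	for _, u := range baseURLs {
		if _, isOwn := own[normalizeHostPort(u)]; isOwn {
			continue
		}
		kept = append(kept, u)
	}
	return kept
}

func main() {
	own := ownDeploymentAddrs([]backend{{Address: "127.0.0.1:8080"}})
	scanned := []string{"http://localhost:8080", "http://127.0.0.1:11434"}
	fmt.Println(filterExternal(scanned, own))
}
```

With the own deployment on 127.0.0.1:8080, the scanned `http://localhost:8080` entry is dropped and the :11434 entry is kept, which mirrors the behavior the tests below describe.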

Tests

  • normalizeHostPort table (scheme/path strip, loopback folding).
  • List excludes an own :8080 deployment while keeping a genuine Ollama :11434 service (real in-memory DB + proxy with a non-external backend).

go test ./..., go build ./..., go vet, gofmt clean. Verified on real hardware: after the fix, external.list no longer returns the AIMA-deployed :8080, while deploy.list still shows it running.

Codex and others added 2 commits June 9, 2026 15:09
…services

The external-service scanner probes a fixed port list that includes 8080 —
llama.cpp's default port — which AIMA's own native deployments also bind. So a
model AIMA deployed itself (e.g. Qwen3.6-27B-Q4_K_M on 127.0.0.1:8080) was also
surfaced under "external services" as importable, and importing it would
register a self-referential backend.

Exclude any scanned/persisted service whose host:port matches an AIMA-owned
(non-external) proxy backend:

- normalizeHostPort: reduce a base URL/address to a comparable host:port,
  folding localhost/::1/0.0.0.0 to 127.0.0.1.
- Reconciler.ownDeploymentAddrs: host:port set of non-external proxy backends.
- Scan skips own-deployment addresses (no upsert); List filters them out so a
  previously-recorded self entry disappears too.

Tests: normalizeHostPort table; List excludes the own :8080 deployment while a
genuine external service (ollama :11434) is kept.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

A scanned (auto-discovered) external service was upserted when found but never
reconciled away when it disappeared. So after a model was undeployed and its
backend (e.g. native llama.cpp on :8080) stopped, the dead service lingered as a
stale "reachable" row and kept showing under external services.

On each scan, mark previously-discovered scanned (non-imported, non-own) services
that are no longer reachable as unreachable; List now hides scanned, non-imported,
unreachable services. Imported services are untouched (they still show, with their
status, by user intent).
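
The reconciliation rule in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the commit's code: the `service` struct, its field names, and the `visible` helper are hypothetical, and only the name `staleScannedAddrs` and the skip conditions come from the commit message.

```go
package main

import "fmt"

// service is a hypothetical row in the external-services table.
type service struct {
	Addr      string // normalized host:port
	Source    string // "scan" for auto-discovered entries
	Imported  bool
	Reachable bool
}

// staleScannedAddrs returns scanned, non-imported, non-own services that
// are still marked reachable but were not found by the latest scan.
func staleScannedAddrs(known []service, found, own map[string]bool) []string {
	var stale []string
	for _, s := range known {
		if s.Source != "scan" || s.Imported || !s.Reachable {
			continue // imported or already-unreachable rows are left alone
		}
		if own[s.Addr] || found[s.Addr] {
			continue // own deployments and still-live services are not stale
		}
		stale = append(stale, s.Addr)
	}
	return stale
}

// visible reports whether List should show a service.
func visible(s service, own map[string]bool) bool {
	if own[s.Addr] {
		return false
	}
	if s.Source == "scan" && !s.Imported && !s.Reachable {
		return false
	}
	return true
}

func main() {
	known := []service{
		{Addr: "127.0.0.1:8080", Source: "scan", Reachable: true},
		{Addr: "127.0.0.1:11434", Source: "scan", Reachable: true},
		{Addr: "127.0.0.1:9000", Source: "scan", Imported: true},
	}
	found := map[string]bool{"127.0.0.1:11434": true}
	own := map[string]bool{}

	// Mark vanished services unreachable, then list what remains visible.
	stale := map[string]bool{}
	for _, a := range staleScannedAddrs(known, found, own) {
		stale[a] = true
	}
	for i := range known {
		if stale[known[i].Addr] {
			known[i].Reachable = false
		}
		if visible(known[i], own) {
			fmt.Println(known[i].Addr)
		}
	}
}
```

In this sketch the vanished :8080 row is hidden, while the reachable :11434 row and the offline imported :9000 row still appear.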

Tests: staleScannedAddrs (skips imported / already-unreachable / own deployments);
List hides a vanished scanned :8080 while keeping a reachable :11434 and an offline
imported :9000.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

rjckkkkk added a commit that referenced this pull request Jun 10, 2026
The dist exe filename now carries its version string so a new build does NOT
overwrite the previous one in place. Two builds coexist:

  aima-windows-amd64-v0.5-dev-amd-strix-halo-20260610.exe  -> v0.5-dev-amd-strix-halo-20260610
      latest; adds the out-of-box AMD-HIP llama.cpp engine (#85, #86). serve.bat uses this.
  aima-windows-amd64-v0.5-dev-amd-strix-halo.exe           -> v0.5-dev-amd-strix-halo
      restored 2026-06-09 build (#78-#83 only, no HIP auto-download). Kept for rollback.

Both from source commit fc3ef41; filename == the exe's own `aima version` string.

- #85 llamacpp-hip-windows engine asset (go:embed'd into the exe + in source at
  catalog/engines/llamacpp-hip-windows.yaml, pins official b9330 win-hip-radeon-x64):
  a no-NVIDIA Strix Halo box auto-downloads the right ROCm/HIP llama.cpp instead of
  the CPU-only CUDA universal source.
- #86 native runtime resolves the engine binary against AIMA_ENGINE_DIR
  (dist -> AIMA_ENGINE_DIR -> auto-download -> PATH), so a pre-installed llama.cpp
  of ANY version is launchable -- the partner's llama.cpp version is supported
  whether or not it matches the bundled b9330.

Verified the latest build on the 395 rig: `aima version` ->
v0.5-dev-amd-strix-halo-20260610 / fc3ef41; `aima engine info llamacpp-hip-windows`
-> b9330 asset; `aima hal detect` -> RDNA3.5 gfx1151, ~110 GB VRAM.
README + serve.bat updated (build table; AIMA_ENGINE_DIR now optional).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>