Open
21 commits
f64e25d
add config-as-checkpoint, error classification, and per-trial writes …
orban Feb 24, 2026
4aae36a
default eval model to sonnet for reproducibility
orban Feb 24, 2026
b981018
bump default parallelism from 2 to 8 workers
orban Feb 24, 2026
d5bc415
add circuit breaker to skip remaining reps after repeated failures
orban Feb 24, 2026
a65cdde
add supervisor loop with auto-remediation for infra failures
orban Feb 24, 2026
825709f
add file-based control plane for external eval supervision
orban Feb 24, 2026
e23bc3b
add per-task Fisher analysis, recommendations, and eval monitor
orban Feb 24, 2026
8ca48a5
add AGENTbench native adapter with security and perf fixes
orban Feb 25, 2026
785c257
fix stale test references to renamed setup_workspace method
orban Feb 25, 2026
b53c614
add network parameter to run_in_docker, default to bridge
orban Feb 25, 2026
2d42eab
fix agentbench loader split and docker shell compatibility
orban Feb 25, 2026
9ab8632
use login shell in docker to pick up ~/.local/bin PATH
orban Feb 26, 2026
8d3b6ef
fix regression evaluation to match paper's delta-based logic, add rem…
orban Feb 27, 2026
d3ecadc
replace rsync with tar-over-SSH for remote docker workspace sync
orban Feb 27, 2026
2c9ea2e
pull docker images on remote host when EVAL_DOCKER_HOST is set
orban Feb 27, 2026
fb0e5b6
add SSH timeouts and sync-back excludes to docker runner
orban Mar 1, 2026
03c1ece
switch agentbench to persistent containers, fix tar overlay UID bug
orban Mar 2, 2026
a30b7aa
add throughput instrumentation, LPT scheduling, idle-timeout kill
orban Apr 20, 2026
e7071b6
fix kill races, LPT median bias, and silent reader-thread drops
orban Apr 20, 2026
250a20c
add knowledge-intake ingestion run report
orban May 23, 2026
ea7e99d
Add 2026-05-25 knowledge ingest report
orban May 25, 2026
1 change: 1 addition & 0 deletions .gitignore
@@ -20,3 +20,4 @@ CODEX_REVIEW.md
.serena/
# Nightshift plan artifacts (keep out of version control)
.nightshift-plan
.dmux/
57 changes: 57 additions & 0 deletions docs/reports/2026-05-23-knowledge-intake-ingestion-run.md
@@ -0,0 +1,57 @@
# Knowledge Intake Ingestion Run — 2026-05-23

Operational run of the `knowledge-intake` ingestion pipeline, capturing how many items each source returned.

## Command

```bash
cd ~/dev/knowledge-intake && source .env && bun run src/cli.ts ingest
```

- **Run date**: 2026-05-23
- **Repo**: `~/dev/knowledge-intake` (sibling to `intent-layer`)
- **Runtime**: `bun`
- **Exit code**: `0` (overall run succeeded; one source errored — see below)

## Per-source results

| Source | Items fetched | Status |
|------------|--------------:|--------|
| pinboard | 20 | OK |
| papers | 9 | OK |
| github | 100 | OK |
| email | 0 | Error — see below |
| **Total** | **129** | 3 of 4 sources succeeded |

The reported total of **129 new items** equals pinboard (20) + papers (9) + github (100). The `email` source contributed 0 items because it failed before fetching.
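
As a quick check of the arithmetic above, the per-source counts can be summed straight from the log lines. This is a minimal sketch, assuming the `[source] fetched N new items` line format shown in this report's raw log. `sum_fetched` is a hypothetical helper name, not part of the pipeline, and the email error line is abbreviated here.

```bash
#!/usr/bin/env bash
# sum_fetched (hypothetical helper): reads log lines on stdin and sums N
# from every line of the form "[source] fetched N new items".
# Error lines do not match the pattern and so contribute nothing.
sum_fetched() {
  awk '/^\[[a-z]+] fetched [0-9]+ new items$/ { total += $3 } END { print total + 0 }'
}

# Log lines copied from this report's raw log (email error abbreviated).
log='[pinboard] fetched 20 new items
[papers] fetched 9 new items
[github] fetched 100 new items
[email] error: gog gmail search failed (exit 1)'

total=$(printf '%s\n' "$log" | sum_fetched)
echo "$total"
```

The result can be compared with the `Ingestion complete: ... new items` line to confirm that the reported total matches the per-source counts.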

## Source-level error: email

The `email` source failed during a Gmail search via the `gog` CLI:

```text
[email] error: gog gmail search failed (exit 1): gmail options: token source: get token for ryan.orban@gmail.com: read token: keyring connection timed out after 10s while reading keyring item (macOS Keychain may be waiting for a permission prompt; run `gog auth list` from a terminal and click "Always Allow" when prompted); set GOG_KEYRING_BACKEND=file and GOG_KEYRING_PASSWORD=<password> to use encrypted file storage instead
```

**Cause**: The macOS Keychain read for the Gmail OAuth token timed out after 10s. As the error message suggests, the likely trigger is that the Keychain was waiting on an interactive "Always Allow" permission prompt, which cannot be answered in a non-interactive run.

**Remediation** (either option):

1. Run `gog auth list` from an interactive terminal and click **Always Allow** when macOS prompts for Keychain access, then re-run the ingestion.
2. Switch `gog` to encrypted file-based token storage so no Keychain prompt is needed: set `GOG_KEYRING_BACKEND=file` and `GOG_KEYRING_PASSWORD=<password>` in the environment before running.

This is an environment/auth issue local to the run host, not a code defect in the ingestion pipeline. The other three sources fetched normally.
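
Option 2 might look like the following in the shell that launches the run. This is a sketch only: the variable names come from the `gog` error message above, `<password>` is left as a placeholder, and whether `source .env` later overrides these values depends on the contents of `.env`, which this report does not show.

```bash
# Use gog's encrypted file-based token storage instead of the macOS Keychain.
# Both variable names are taken from the gog error message in this report.
export GOG_KEYRING_BACKEND=file
# Placeholder value: replace <password> with the real keyring password.
export GOG_KEYRING_PASSWORD='<password>'
```

With both variables exported, re-run the command from the Command section in the same shell.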

## Raw log

```text
$ cd ~/dev/knowledge-intake && source .env && bun run src/cli.ts ingest

[pinboard] fetched 20 new items
[papers] fetched 9 new items
[github] fetched 100 new items
[email] error: gog gmail search failed (exit 1): gmail options: token source: get token for ryan.orban@gmail.com: read token: keyring connection timed out after 10s while reading keyring item (macOS Keychain may be waiting for a permission prompt; run `gog auth list` from a terminal and click "Always Allow" when prompted); set GOG_KEYRING_BACKEND=file and GOG_KEYRING_PASSWORD=<password> to use encrypted file storage instead

Ingestion complete: 129 new items
EXIT_CODE=0
```
58 changes: 58 additions & 0 deletions docs/reports/2026-05-25-knowledge-intake-ingestion-run.md
@@ -0,0 +1,58 @@
# Knowledge Intake Ingestion Run — 2026-05-25

Operational run of the `knowledge-intake` ingestion pipeline, capturing how many items each source returned.

## Command

```bash
cd ~/dev/knowledge-intake && source .env && bun run src/cli.ts ingest
```

- **Run date**: 2026-05-25
- **Repo**: `~/dev/knowledge-intake` (sibling to `intent-layer`)
- **Runtime**: `bun`
- **Exit code**: `0` (overall run succeeded; one source errored — see below)
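
Because the exit code is `0` even when a source errors, a caller that wants to notice source-level failures has to inspect the log itself. The sketch below assumes error lines have the form `[source] error: ...`, as in this report's raw log. `failed_sources` is a hypothetical helper name, not part of the pipeline, and the email error line is abbreviated.

```bash
#!/usr/bin/env bash
# failed_sources (hypothetical helper): reads log lines on stdin and prints
# the name of every source whose line has the form "[name] error: ...".
failed_sources() {
  sed -n 's/^\[\([a-z]*\)] error:.*/\1/p'
}

# Log lines copied from this report's raw log (email error abbreviated).
log='[pinboard] fetched 0 new items
[papers] fetched 0 new items
[github] fetched 100 new items
[email] error: gog gmail search failed (exit 1)'

failed=$(printf '%s\n' "$log" | failed_sources)
if [ -n "$failed" ]; then
  echo "sources with errors: $failed"
fi
```

A wrapper script could exit non-zero when `failed` is non-empty, so that a scheduler flags the run even though the ingest command itself reports success.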

## Per-source results

| Source | Items fetched | Status |
|------------|--------------:|--------|
| pinboard | 0 | OK |
| papers | 0 | OK |
| github | 100 | OK |
| email | 0 | Error — see below |
| **Total** | **100** | 3 of 4 sources succeeded |

The reported total of **100 new items** equals pinboard (0) + papers (0) + github (100). The `email` source contributed 0 items because it failed before fetching.

## Source-level error: email

The `email` source failed during a Gmail search via the `gog` CLI:

```text
[email] error: gog gmail search failed (exit 1): gmail options: token source: get token for ryan.orban@gmail.com: read token: keyring connection timed out after 10s while reading keyring item (macOS Keychain may be waiting for a permission prompt; run `gog auth list` from a terminal and click "Always Allow" when prompted); set GOG_KEYRING_BACKEND=file and GOG_KEYRING_PASSWORD=<password> to use encrypted file storage instead
```

**Cause**: The macOS Keychain read for the Gmail OAuth token timed out after 10s. As the error message suggests, the likely trigger is that the Keychain was waiting on an interactive "Always Allow" permission prompt, which cannot be answered in a non-interactive run.

**Remediation** (either option):

1. Run `gog auth list` from an interactive terminal and click **Always Allow** when macOS prompts for Keychain access, then re-run the ingestion.
2. Switch `gog` to encrypted file-based token storage so no Keychain prompt is needed: set `GOG_KEYRING_BACKEND=file` and `GOG_KEYRING_PASSWORD=<password>` in the environment before running.

This is an environment/auth issue local to the run host, not a code defect in the ingestion pipeline. The other three sources completed without error, although pinboard and papers returned no new items.

## Raw log

```text
$ cd ~/dev/knowledge-intake && source .env && bun run src/cli.ts ingest

[pinboard] fetched 0 new items
[papers] fetched 0 new items
[github] fetched 100 new items
[email] error: gog gmail search failed (exit 1): gmail options: token source: get token for ryan.orban@gmail.com: read token: keyring connection timed out after 10s while reading keyring item (macOS Keychain may be waiting for a permission prompt; run `gog auth list` from a terminal and click "Always Allow" when prompted); set GOG_KEYRING_BACKEND=file and GOG_KEYRING_PASSWORD=<password> to use encrypted file storage instead

Ingestion complete: 100 new items

EXIT_CODE=0
```
111 changes: 0 additions & 111 deletions eval-harness/.index-cache-preserve/ansible-flat_llm/AGENTS.md

This file was deleted.

86 changes: 0 additions & 86 deletions eval-harness/.index-cache-preserve/ansible-flat_llm/CLAUDE.md

This file was deleted.

71 changes: 0 additions & 71 deletions eval-harness/.index-cache-preserve/ansible-intent_layer/CLAUDE.md

This file was deleted.
