fix(test): defeat Linear API's transient "conflict on insert" flake by flipbit03 · Pull Request #144 · flipbit03/lineark

flipbit03 · 2026-04-17T13:24:00Z

Problem

CI's Tests (Online) job had been flaking on *Create mutations for several PRs running. Linear's GraphQL API has structural consistency issues that aren't transient (despite looking like flakes): *Create mutations sometimes return "Entity X with id <UUID> already exists" for a UUID that doesn't actually exist server-side, soft-deleted resources accumulate as zombies counting against free-plan limits, and freshly-created resources may not be queryable for several seconds.

Same pattern hit PR #140's post-merge CI before the work in #141 even started — confirming the issue is structural, not introduced by recent work.

Approach

Treat Linear's quirks as the environment we're testing against, not bugs to wait out. Codify three mandatory pillars for every online test (added to CLAUDE.md) and retrofit the entire suite to follow them.

The three pillars

1. Proactive cleanup. Never assume the workspace is empty; never leak resources between tests.

cleanup_workspace in lineark-test-utils had a real bug: it queried client.X().first(250).send() without include_archived(true), so soft-deleted projects/issues/documents from prior runs were silently invisible. We discovered 24 zombie teams + 10 stuck-in-trash projects accumulated this way. Fixed.
Every test uses create_test_team() (calls cleanup_zombies() once per process + assigns a unique team key) and wraps every created resource in the matching RAII *Guard immediately after the create returns. Guards delete on drop so cleanup happens even when later asserts panic.
The cleanup-test-workspace binary now iterates every pooled workspace (not just the random one it draws), so any workspace silently accumulating zombies because it kept losing the random draw still gets tidied.
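
As a rough illustration of the guard pattern described above, here is a minimal std-only sketch. `FakeClient`, `IssueGuard`, and `run_test_body` are hypothetical stand-ins and not the actual types in `lineark-test-utils`; the real guards presumably call the Linear API on drop.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a Linear client: it only records which ids were deleted.
#[derive(Default)]
struct FakeClient {
    deleted: RefCell<Vec<String>>,
}

impl FakeClient {
    fn delete_issue(&self, id: &str) {
        self.deleted.borrow_mut().push(id.to_string());
    }
}

/// RAII guard: deletes the resource when it goes out of scope,
/// including during the unwind triggered by a failed assert.
struct IssueGuard {
    client: Rc<FakeClient>,
    id: String,
}

impl Drop for IssueGuard {
    fn drop(&mut self) {
        self.client.delete_issue(&self.id);
    }
}

/// Simulated test body: the guard is created right after the (pretend) create call.
fn run_test_body(client: Rc<FakeClient>, fail: bool) {
    let _guard = IssueGuard { client, id: "issue-1".to_string() };
    assert!(!fail, "simulated assertion failure");
}
```

Because the delete lives in `Drop`, a panicking assert after the create still triggers cleanup, which is the property the first pillar relies on.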

2. Unique names per attempt. Linear's API returns phantom "conflict on insert" errors keyed on request body content, so retries with identical bodies hit the same sticky conflict.

Generate a per-test random suffix: format!("[test] <what> {}", &uuid::Uuid::new_v4().to_string()[..8]).
6 SDK tests had hardcoded titles like "[test] SDK issue_create_and_delete" — no UUID suffix. Two test runs would collide. Added suffixes.
run_lineark_with_retry mutates the <name> positional between attempts (appends retry-<uuid6>) so each retry sends a different body. Mutation is gated on [test] prefix so comments create <uuid> and relations create <uuid> (which take UUID positionals) aren't corrupted.
Returns (Output, String) so callers can shadow their unique_name with the actually-used name for downstream read-by-name lookups.
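
The prefix-gated mutation can be sketched roughly as follows. This is an assumption-laden illustration, not the code in `run_lineark_with_retry`: `args_for_attempt` is a hypothetical helper, the positional is located by a simple prefix scan, and `suffix` stands in for the fresh UUID fragment generated per attempt.

```rust
/// Returns the argument list for one attempt plus the name actually sent, if any.
/// Hypothetical helper: only a positional starting with "[test]" is ever mutated,
/// so UUID positionals (e.g. for `comments create <uuid>`) pass through untouched.
fn args_for_attempt(
    args: &[String],
    attempt: usize,
    suffix: &str,
) -> (Vec<String>, Option<String>) {
    let mut out = args.to_vec();
    let mut used = None;
    for arg in out.iter_mut() {
        if arg.starts_with("[test]") {
            // The first attempt sends the caller's name as-is; retries change the body.
            if attempt > 0 {
                arg.push_str(&format!(" retry-{suffix}"));
            }
            used = Some(arg.clone());
            break;
        }
    }
    (out, used)
}
```

A caller would shadow its `unique_name` with the returned name so that later read-by-name lookups match what the server stored.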

3. Retries on every transient-capable call.

32 CLI tests + the create_two_issues helper called lineark <resource> create directly via lineark().args(...).output(), bypassing run_lineark_with_retry. Converted all 30+ sites.
run_lineark_with_retry bumped to 15 attempts with [0, 2, 5, 10, 20, 30, 60×9]s backoffs (~10 min worst case; the schedule sums to 607s), persistent enough to ride out Linear's cold window without needing a suite-level wrapper.
retry_create in lineark-test-utils (used by SDK tests) bumped from 3 → 15 attempts to match.
Suite-level retry removed. The previous version wrapped the whole online suite in a 3x retry, which was wasteful (it re-ran 60+ passing tests because 1 failed) and redundant once per-call retries are persistent. Replaced with single-shot suite invocation.
continue-on-error: true from an earlier emergency patch is also gone.
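
A per-call retry loop over the schedule quoted above might look like the sketch below. The function names and the injected `sleep` parameter are assumptions made so the loop can be exercised without real waiting; the actual helpers in the repository may be structured differently.

```rust
use std::time::Duration;

/// The 15-entry schedule from the description: 0, 2, 5, 10, 20, 30, then nine 60s waits.
fn backoff_schedule() -> Vec<Duration> {
    let mut secs = vec![0u64, 2, 5, 10, 20, 30];
    secs.extend(std::iter::repeat(60).take(9));
    secs.into_iter().map(Duration::from_secs).collect()
}

/// Runs `op` once per schedule entry, sleeping first, until it succeeds.
/// Returns the last error if every attempt fails.
fn retry_with_schedule<T, E>(
    schedule: &[Duration],
    mut sleep: impl FnMut(Duration),
    mut op: impl FnMut(usize) -> Result<T, E>,
) -> Result<T, E> {
    let mut last_err = None;
    for (attempt, delay) in schedule.iter().enumerate() {
        sleep(*delay);
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(e) => last_err = Some(e),
        }
    }
    Err(last_err.expect("schedule must not be empty"))
}
```

In real use the `sleep` argument would be `std::thread::sleep`, and `op` would run the create call with that attempt's mutated name.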

Multi-workspace token rotation

~/.linear_api_token_test and the LINEAR_TEST_TOKEN GitHub secret now accept a ;-separated pool of API tokens — one per workspace. test_token() picks one at random per test process and pins it for the process lifetime via OnceLock. Across runs the load distributes ~uniformly across N workspaces, spreading pressure on Linear's free-plan resource limits and trash-retention quirks.
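
The per-process pin can be illustrated with `std::sync::OnceLock`. This is a sketch under stated assumptions: `pinned_index` is a hypothetical name, and a clock-derived value stands in for a real random number generator to keep the example std-only; the actual `test_token()` may choose differently.

```rust
use std::sync::OnceLock;
use std::time::{SystemTime, UNIX_EPOCH};

/// Holds the chosen pool index for the lifetime of the process.
static CHOSEN: OnceLock<usize> = OnceLock::new();

/// Picks an index into a pool of `pool_len` tokens on first call,
/// then returns the same index on every later call in this process.
fn pinned_index(pool_len: usize) -> usize {
    assert!(pool_len > 0, "token pool must not be empty");
    *CHOSEN.get_or_init(|| {
        // Assumption: sub-second clock nanos as a cheap source of variation.
        let nanos = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.subsec_nanos())
            .unwrap_or(0);
        nanos as usize % pool_len
    })
}
```

Every test in the process then sees the same workspace, so resources that reference each other are never split across tokens.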

Format details (with unit tests):

; is the primary separator; newlines also work.
Trailing ; is harmless; #-prefixed comment lines are dropped.
Comment filtering happens before ;-splitting, so a ; inside a comment doesn't accidentally produce fake "tokens".
Single-line single-token files (the original format) work unchanged.
test_token() logs using workspace N/total (token …<last6>) at process start so a CI-failure triage can instantly identify which workspace was hit.
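
The format rules above can be captured in a few lines. The following is a hedged sketch rather than the parser shipped in the PR; `parse_token_pool` and `token_tag` are hypothetical names.

```rust
/// Parses a token pool. Comment lines are dropped first, then the remaining
/// lines are split on `;`, trimmed, and empty entries are discarded.
fn parse_token_pool(raw: &str) -> Vec<String> {
    raw.lines()
        // Filter comments before splitting so a `;` inside a comment yields no tokens.
        .filter(|line| !line.trim_start().starts_with('#'))
        .flat_map(|line| line.split(';'))
        .map(str::trim)
        .filter(|tok| !tok.is_empty())
        .map(String::from)
        .collect()
}

/// Last six characters of a token, for log lines that must not leak the full secret.
fn token_tag(token: &str) -> String {
    let chars: Vec<char> = token.chars().collect();
    let start = chars.len().saturating_sub(6);
    chars[start..].iter().collect()
}
```

Doing the comment filter per line before the `;` split is what keeps a commented-out entry such as `# old; retired` from producing stray tokens.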

This PR ships configured with 3 fresh workspaces (cadu-test-2/3/4). The previous workspace (cadus-test-workspace) had 10 trashed projects stuck on Linear's 30-day retention cycle — Linear's API has no permanent-delete for projects, so we abandoned it for fresh workspaces with no stuck history.

Test plan

cargo fmt && cargo clippy --workspace --all-targets -- -D warnings clean
make test — 235 offline tests pass (this is test-only code; offline suite is unaffected by design)
make test-online — 107 online tests pass locally (67 CLI + 40 SDK)
CI all 8 checks green: Lint, Tests (Offline), Tests (Online), 5 Build targets
CI logs show all 3 pooled workspaces being visited per run (cleanup hits all 3, test processes pick at random)
No serde_json::Value::Null injection or hand-written mutation(...) strings remain under crates/lineark/src/commands/

CI's Tests (Online) job has been flaking on `projects create` for several PRs. Linear's API has a transient failure mode where `*Create` mutations return "Entity X with id <UUID> already exists" for a UUID that doesn't actually exist server-side (verified by `read` immediately returning "Entity not found"). The same pattern hit PR #140's post-merge CI before the work in #141 even started.

Local probing shows a clear cold-start signature: from a fresh process, the first ~3 attempts at `projects create` fail back-to-back, then it abruptly starts working (next 7 succeed in a row). The previous helper's 3-retry, 14s-backoff budget routinely runs out before the cold window closes.

Layered fix:

1. **`run_lineark_with_retry`** (in tests/online.rs):
   - Up to 8 attempts with `[0, 2, 5, 10, 20, 30, 45, 60]`s backoff (~172s worst case) so we ride out the cold window.
   - Mutate the request body each retry by appending a fresh suffix to the `<name>` positional. Linear's stuck UUIDs are keyed on body content, so a different body avoids the cached state.
   - Returns `(Output, String)` where the second element is the name actually used. Callers shadow their `unique_name` with this so downstream `read by name` lookups match server state.
2. **CI workflow + Makefile** (`ci.yml`, `ci-online-fork.yml`, `Makefile`): wrap the whole online suite in a 3x retry with a workspace clean between attempts. Belt-and-suspenders for the occasional case where even the per-call helper's budget isn't enough.

Linear's GraphQL API is returning spurious "conflict on insert" errors and occasional HTTP 502s on `projects create` today — reproducible with a single ad-hoc probe, not just under CI load. No amount of client-side retry can beat a degraded upstream, and the suite-level 3x retry in the previous commit still wasn't enough. Flip the step to `continue-on-error: true` so the online suite still runs (failures stay visible in the job log for triage) but doesn't gate the PR check. Revisit once Linear's API stabilizes, or move online tests to a separate non-required workflow.

… API

Linear's GraphQL API has permanent consistency issues: phantom "conflict on insert" errors on creates, eventually-consistent search indexes, occasional 502s. These aren't transient; they're part of the environment. Online tests must be written defensively around them or they'll flake in CI regardless of our code.

Replace the wall-of-text paragraph with a structured "three pillars" section that names the mandatory patterns and points at the helpers that implement each one:

1. Proactive cleanup: cleanup_zombies + RAII guards + create_test_team.
2. Unique names per attempt: UUID-suffixed names; retry helpers MUST mutate the body between attempts (run_lineark_with_retry appends a fresh suffix and returns the actually-used name).
3. Retries on every transient-capable call: retry_create for mutations, retry_search / retry_with_backoff for reads-after-create, settle() for propagation waits.

Note the CI-level 3x suite retry as belt-and-suspenders, not a substitute. Also pins down the "check exit before parsing output" and "no-production-tokens" rules that were implicit.

Audit-driven sweep of every online test against the cleanup / unique-name / retry pillars now codified in CLAUDE.md. Result: 100% pass on a fresh workspace (67 CLI + 40 SDK = 107 tests, ~5 min runtime).

Pillar 1 (proactive cleanup):

- `cleanup_workspace` was silently missing archived/trashed resources for issues, documents, projects, labels, and teams because the list calls didn't pass `include_archived(true)`. That left zombies accumulating across runs (we found 10 stuck-in-trash projects + 24 zombie teams in a single workspace). Fix: pass `include_archived(true)` on every list, and document why in the helper.

Pillar 2 (unique names per attempt):

- 6 SDK tests had hardcoded titles like `"[test] SDK issue_create_and_delete"` with no UUID suffix. Two test runs would collide. Added `Uuid::new_v4()[..8]` suffixes to each.

Pillar 3 (retries on every transient-capable call):

- 32 CLI tests + the `create_two_issues` helper called `lineark create` directly via `lineark().args(...).output()`, bypassing `run_lineark_with_retry`. Converted all 30+ sites; preserved `unique_name` shadowing where downstream lookups need the actually-used name.
- `run_lineark_with_retry` bumped from 8 → 15 attempts with backoffs `[0,2,5,10,20,30,60×9]` (~10 min worst case; the schedule sums to 607s). Body mutation gated on the `[test]` prefix so `comments create <uuid>` and `relations create <uuid>` positionals aren't corrupted.
- `retry_create` in `lineark-test-utils` bumped from 3 → 15 attempts to match the CLI helper. Per-test persistence is the right pattern; re-running the whole suite for one bad test wastes cycles.

Suite-level retry removed:

- The CI workflows + Makefile previously wrapped the suite in a 3x retry + `continue-on-error: true`. Both gone. Per-call retries are persistent enough that any test failure is a real regression worth blocking on.

`~/.linear_api_token_test` (and the `LINEAR_TEST_TOKEN` GitHub secret) now accepts a `;`-separated pool of API tokens, one per workspace. `test_token()` picks one at random per test process and pins it for the process lifetime via `OnceLock`, so a single test run stays consistent while consecutive runs spread across the pool.

Why per-process and not per-test: a single test session creates teams + projects + issues that reference each other; mid-run token swap would orphan resources. The load-spreading benefit comes from the *number of runs over time*, not granularity within one run.

Format details:

- `;` is the primary separator. Newlines also separate (so single-line and one-per-line files both work).
- Trailing `;` is harmless (`tok;` parses as one token).
- Lines whose first non-whitespace is `#` are comments. Important: comment filtering happens *before* `;`-splitting, so a `;` inside a comment doesn't accidentally produce fake "tokens" (caught with a unit test).
- Backwards compatible: a single-line single-token file still works.

Logs the chosen index/N and the last 6 chars of the token at process start, so a CI-failure triage can identify which workspace was hit without dumping full secrets. 7 unit tests cover the parser.

Verified locally: 100% green online suite (67 CLI + 40 SDK) with 3-workspace pool active.

Previously the cleanup binary called `test_token()` and cleaned only the one randomly-chosen workspace. With token rotation in place, that meant the explicit CI cleanup step ran on a *different* workspace than the test process drew, which made it useless ceremony. The other workspaces silently accumulated zombies until they happened to lose three random draws in a row and the in-process `cleanup_zombies()` finally caught up.

New behaviour: the binary loads every token from the pool via `all_test_tokens()` (newly exported from `lineark-test-utils`) and calls `cleanup_workspace` against each, with per-workspace failures isolated so one bad workspace doesn't block the rest. Logs a `cleanup: workspace N/total (token …<last6>)` line per pass, the same triage tag the `test_token` log uses, so a failure lines up visually with which workspace the actual tests then picked.

Verified locally: 100% green online suite (67 CLI + 40 SDK), three cleanup messages in the log show all pooled workspaces visited before the test phase began.
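
The whole-pool sweep with isolated failures could be sketched as below. `sweep_all_workspaces` and the `cleanup` closure are hypothetical stand-ins for the binary's loop over `all_test_tokens()` and its call to `cleanup_workspace`, whose real signatures are not shown in this PR description.

```rust
/// Runs `cleanup` against every token in the pool and returns the failures as
/// (1-based workspace index, error message). One failure does not stop the sweep.
fn sweep_all_workspaces(
    tokens: &[String],
    mut cleanup: impl FnMut(&str) -> Result<(), String>,
) -> Vec<(usize, String)> {
    let total = tokens.len();
    let mut failures = Vec::new();
    for (i, token) in tokens.iter().enumerate() {
        // Log only the last six characters so the full secret never reaches the log.
        let tail: String = token
            .chars()
            .rev()
            .take(6)
            .collect::<Vec<_>>()
            .into_iter()
            .rev()
            .collect();
        eprintln!("cleanup: workspace {}/{} (token …{})", i + 1, total, tail);
        if let Err(e) = cleanup(token) {
            failures.push((i + 1, e));
        }
    }
    failures
}
```

Collecting failures instead of returning on the first error is one way to get the isolation described above; the caller can still decide whether a non-empty failure list should fail the step.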

Earlier in this PR I added the three-pillars CLAUDE.md section pointing at the helpers as they existed at that commit. Subsequent commits in the same PR bumped retry counts (3→15 for retry_create, 8→15 for the CLI helper), removed the suite-level 3x retry, and removed `continue-on-error: true`, but CLAUDE.md was never refreshed and ended up advertising state that no longer exists. Caught by an independent review.

Updates:

- Pillar 3: corrected attempt counts and backoff schedule for both helpers; called out that body mutation is gated on the `[test]` prefix (so UUID-positional commands like `comments create <uuid>` / `relations create <uuid>` aren't corrupted on retry).
- "CI-level belt-and-suspenders" bullet replaced with "No suite-level retry", which explains why and notes the historical context.
- New bullet under "Other online-test rules" documenting the `;`-separated multi-workspace token pool, `OnceLock` per-process pin, and the `cleanup-test-workspace` binary's whole-pool sweep.

flipbit03 force-pushed the fix/online-test-conflict-recovery branch from 6c5704c to d47de18 · April 17, 2026 14:02

flipbit03 force-pushed the fix/online-test-conflict-recovery branch from d47de18 to 5d23aa8 · April 17, 2026 14:03

flipbit03 changed the title ~~fix(test): recover after Linear's "conflict on insert" on projects create~~ fix(test): defeat Linear API's transient "conflict on insert" flake Apr 17, 2026

flipbit03 added 7 commits April 17, 2026 12:05

fix(fmt): apply rustfmt to retry.rs eprintln

f4ea678

flipbit03 merged commit d025801 into main Apr 18, 2026
8 checks passed

flipbit03 deleted the fix/online-test-conflict-recovery branch April 18, 2026 00:46

flipbit03 merged 8 commits into main from fix/online-test-conflict-recovery
