Skip to content

External ingest: harden finalize against 412 phase mismatch#322254

Open
bhavyaus wants to merge 3 commits into
mainfrom
dev/bhavyau/external-ingest-finalize-412
Open

External ingest: harden finalize against 412 phase mismatch#322254
bhavyaus wants to merge 3 commits into
mainfrom
dev/bhavyau/external-ingest-finalize-412

Conversation

@bhavyaus

@bhavyaus bhavyaus commented Jun 21, 2026

Copy link
Copy Markdown
Collaborator

Fixes the "GitHub Copilot: Build Codebase Semantic Index" failure reported in #320915, where, for non-GitHub repositories, document upload succeeds but POST /external/code/ingest/finalize returns HTTP 412 with request sent for phase which was not the current phase.

Root cause

The external-ingest protocol is phase-gated server-side: create checkpoint → reconcile coded symbols → upload documents → finalize. The server only advances to the finalize phase once it has received every expected document. The client previously:

  • called finalize exactly once with no reconciliation, and
  • swallowed per-document upload failures (logging them but counting them as uploaded), so a stranded server surfaced later as a confusing finalize 412 rather than the real cause.

Fixes #320915

Copilot AI review requested due to automatic review settings June 21, 2026 02:06
@bhavyaus bhavyaus enabled auto-merge (squash) June 21, 2026 02:08

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the Copilot external-ingest indexing flow against server-side phase gating, specifically addressing cases where POST /external/code/ingest/finalize returns HTTP 412 (“phase mismatch”) after document upload appears to have succeeded. It improves correctness and diagnosability by reconciling missing documents and surfacing real upload failures earlier, and it adds protocol-level tests using a mock GitHub fetcher.

Changes:

  • Add a bounded finalize retry loop that re-polls /external/code/ingest/batch, re-uploads still-missing documents, backs off briefly, and retries finalize.
  • Stop counting failed document uploads as successful; instead fail the pass with a descriptive error (including status and requestId when available), and abort deterministically on 404 (“ingest gone”).
  • Add an ExternalIngestClient test suite that exercises the real client against a mock IGithubApiFetcherService for retries, re-upload behavior, retry exhaustion, and error surfacing.
Show a summary per file
File Description
extensions/copilot/src/platform/workspaceChunkSearch/node/codeSearch/externalIngestClient.ts Implements finalize 412 reconciliation/retry and makes document-upload failures fail the ingestion pass deterministically.
extensions/copilot/src/platform/workspaceChunkSearch/test/node/externalIngest.spec.ts Adds end-to-end-ish protocol tests for finalize retries and upload failure behavior using a recording mock fetcher.

Copilot's findings

Comments suppressed due to low confidence (1)

extensions/copilot/src/platform/workspaceChunkSearch/node/codeSearch/externalIngestClient.ts:473

  • The upload task throws when the server requests a docSha that isn't present in mappings, but that rejection is later swallowed by the outer Promise.all(uploading) error handler, letting the pass continue and potentially turning a client-side mapping mismatch into a confusing finalize 412/retry flow. Treat this as a recorded failure so the pass fails deterministically with a descriptive error.
							const fileEntry = mappings.get(requestedDocSha);
							if (!fileEntry) {
								throw new Error(`No mapping for docSha: ${requestedDocSha}`);
							}
  • Files reviewed: 2/2 changed files
  • Comments generated: 1

@bhavyaus bhavyaus marked this pull request as draft June 21, 2026 02:20
auto-merge was automatically disabled June 21, 2026 02:20

Pull request was converted to draft

@bhavyaus bhavyaus marked this pull request as ready for review June 21, 2026 04:57
@bhavyaus bhavyaus requested a review from Copilot June 21, 2026 04:57
@bhavyaus bhavyaus enabled auto-merge (squash) June 21, 2026 04:57

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

bhavyaus added 2 commits June 20, 2026 22:30
Fixes the "Build Codebase Semantic Index" failure for non-GitHub repos
where `/external/code/ingest/finalize` returns HTTP 412 ("request sent
for phase which was not the current phase"). The server only advances to
the finalize phase once it has received every expected document, so a
finalize can fail when the server is still missing documents.

- Retry finalize with backoff (bounded) after a 412, re-polling `/batch`
  for documents the server still reports as missing and re-uploading them.
- Stop swallowing per-document upload failures: collect them and fail the
  pass with a descriptive error (status + requestId) before finalize, so
  the real cause is surfaced instead of a misleading finalize 412. Failed
  uploads are no longer counted as uploaded.
- Make a `/document` 404 ("ingest is gone") abort the pass deterministically
  rather than relying on a rejection an intermediate `Promise.all` can swallow.
- When finalize keeps returning 412 but the server reports no missing
  documents, surface a distinct "likely server-side" error after retries.

Adds an ExternalIngestClient test suite driving the real client through the
ingest HTTP protocol (retry-then-success, re-upload still-missing, exhaustion,
server-side-no-missing-docs, persistent upload failure, and 404 abort).

Refs #320915
- Test suite uses fake timers so finalize-retry backoff doesn't add real wall-clock time
- Record unmapped docSha as a deterministic upload failure instead of throwing (rejection was swallowed by Promise.all)
- updatedFileCount counts unique uploaded docShas across passes instead of summing per-pass attempts
- Add test for the unmapped-docSha failure path
@bhavyaus bhavyaus force-pushed the dev/bhavyau/external-ingest-finalize-412 branch from 46d97fe to ec99d2f Compare June 21, 2026 05:30
@bhavyaus bhavyaus marked this pull request as draft June 21, 2026 05:48
auto-merge was automatically disabled June 21, 2026 05:48

Pull request was converted to draft

createPlatformServices() wires a static auth service backed by createStaticGitHubTokenProvider(), which throws when no GITHUB_PAT/GITHUB_OAUTH_TOKEN is set (CI). Override IAuthenticationService with a static token so getAuthToken() resolves deterministically. Verified by running the suite with the token env vars unset.
@bhavyaus bhavyaus marked this pull request as ready for review June 21, 2026 06:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GitHub Copilot: Semantic index finalize fails with 412 for non-GitHub repository even though external ingest is allowed by organization policy

2 participants