
Fix pipeline race condition: rotate all buffers by pipeline depth #53

Merged
stikves merged 2 commits into apple:main from stikves:sukru/fix-pipeline-race-condition on Jun 18, 2026
@stikves stikves commented Jun 17, 2026

Contributor

With pipeline depth 3, the GPU sampler output and logits buffers were shared across in-flight stages, causing stale reads (repeated tokens) under CPU contention.

Introduce a shared pipelineDepth constant and rotate decodeOutputBuffers, decodeLogitsBuffers, and cachePositionBuffers so no two concurrent stages alias the same memory.
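The aliasing described above can be illustrated with a small deterministic model. This is only a sketch under stated assumptions, not the code from this PR: `run_pipeline`, `PIPELINE_DEPTH`, and the delayed-readback model are hypothetical, and a plain Python list stands in for the GPU buffers. Each step writes its result into slot `step % num_buffers`, and the result is read back only after `PIPELINE_DEPTH - 1` further steps have been submitted.

```python
# Hypothetical stand-in for the shared pipelineDepth constant named in the PR.
PIPELINE_DEPTH = 3


def run_pipeline(num_steps, num_buffers):
    """Simulate a decode loop with PIPELINE_DEPTH stages in flight.

    The "GPU" writes the token for step s into slot s % num_buffers.
    The "CPU" reads the result of step s only after PIPELINE_DEPTH - 1
    later steps have been submitted, which models a delayed readback.
    """
    buffers = [None] * num_buffers
    lag = PIPELINE_DEPTH - 1
    read_back = []
    for step in range(num_steps + lag):
        if step < num_steps:
            # GPU write for the step being submitted now.
            buffers[step % num_buffers] = step
        done = step - lag
        if done >= 0:
            # CPU read for the older step that has just completed.
            read_back.append(buffers[done % num_buffers])
    return read_back


# One buffer shared by all in-flight stages: reads see other steps' data.
shared = run_pipeline(6, num_buffers=1)

# One buffer per in-flight stage: every read sees its own step's data.
rotated = run_pipeline(6, num_buffers=PIPELINE_DEPTH)

print("shared :", shared)
print("rotated:", rotated)
```

In this model the shared buffer returns values that belong to other steps, including repeated tokens at the end, while rotating over `PIPELINE_DEPTH` slots returns each step's own value. The real failure depends on GPU and CPU timing, which this model does not attempt to reproduce.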

Fixes #46

…ple#46)

With pipeline depth 3, the GPU sampler output and logits buffers were
shared across in-flight stages, causing stale reads (repeated tokens)
under CPU contention. Introduce a shared pipelineDepth constant and
rotate decodeOutputBuffers, decodeLogitsBuffers, and cachePositionBuffers
so no two concurrent stages alias the same memory.
@stikves stikves force-pushed the sukru/fix-pipeline-race-condition branch from 0247a60 to 7e0bc95 June 18, 2026 19:29
@stikves stikves marked this pull request as ready for review June 18, 2026 19:30
@stikves stikves self-assigned this Jun 18, 2026
@stikves stikves merged commit e358c84 into apple:main Jun 18, 2026
3 checks passed
@stikves stikves deleted the sukru/fix-pipeline-race-condition branch June 18, 2026 20:47
Development

Successfully merging this pull request may close these issues.

Potential race condition in pipeline with CPU contention