Add ORDER BY to JobSetStateIfRunningMany to try and fix intermittency#920
Merged
Merged
Conversation
bf81230 to
c712cab
Compare
…ency
Here, try to fix a test intermittency problem that I ran into while
trying to fix a different intermittency problem. Here's the error:
--- FAIL: TestDriverRiverDatabaseSQLPgx (0.01s)
driver_test.go:72: Reusing idle postgres schema "river_2025_05_22t23_56_07_schema_07" [user facing: "river_2025_05_22t23_56_07_schema_07"] after cleaning in 12.716292ms [25 generated] [506 reused]
driver_test.go:72: Driver does not support listener; skipping listener tests
--- FAIL: TestDriverRiverDatabaseSQLPgx/JobSetStateIfRunningMany_MultipleJobsAtOnce (0.00s)
riverdrivertest.go:2663: TestTx using postgres schema: river_2025_05_22t23_56_07_schema_14
riverdrivertest.go:2679:
Error Trace: /Users/brandur/Documents/projects/river/internal/riverinternaltest/riverdrivertest/riverdrivertest.go:2679
Error: Not equal:
expected: "completed"
actual : "retryable"
Diff:
--- Expected
+++ Actual
@@ -1,2 +1,2 @@
-(rivertype.JobState) (len=9) "completed"
+(rivertype.JobState) (len=9) "retryable"
Test: TestDriverRiverDatabaseSQLPgx/JobSetStateIfRunningMany_MultipleJobsAtOnce
riverdbtest.go:289: Checked in postgres schema "river_2025_05_22t23_56_07_schema_07"; 8 idle schema(s) [30 generated] [1000 reused]
FAIL
FAIL github.com/riverqueue/river 22.974s
FAIL
It stems from these lines in the driver tests:
jobsAfter, err := exec.JobSetStateIfRunningMany(ctx, setStateManyParams(
riverdriver.JobSetStateCompleted(job1.ID, now, []byte(`{"a":"b"}`)),
riverdriver.JobSetStateErrorRetryable(job2.ID, future, makeErrPayload(t, now), nil),
riverdriver.JobSetStateCancelled(job3.ID, now, makeErrPayload(t, now), nil),
))
require.NoError(t, err)
completedJob := jobsAfter[0]
require.Equal(t, rivertype.JobStateCompleted, completedJob.State)
I think the problem is that when selecting rows to return from
`JobSetStateIfRunningMany`. Usually these return in input order, but SQL
doesn't guarantee it, so occasionally jobs come back in the wrong order
and we get the `retryable` job before the `completed` job.
Here, we add an `ORDER BY` on the final `SELECT`. I only add it on one
side of the `UNION` which if I'm reading this correctly [1], should be
enough.
[1] https://www.postgresql.org/message-id/16814.1280268424%40sss.pgh.pa.us
c712cab to
18da744
Compare
bgentry
approved these changes
May 23, 2025
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Here, try to fix a test intermittency problem that I ran into while
trying to fix a different intermittency problem. Here's the error:
It stems from these lines in the driver tests:
I think the problem is that when selecting rows to return from
JobSetStateIfRunningMany. Usually these return in input order, but SQLdoesn't guarantee it, so occasionally jobs come back in the wrong order
and we get the
retryablejob before thecompletedjob.Here, we add an
ORDER BYon the finalSELECT. I only add it on oneside of the
UNIONwhich if I'm reading this correctly [1], should beenough.
[1] https://www.postgresql.org/message-id/16814.1280268424%40sss.pgh.pa.us