Skip to content

Materialise FK lists in ImportMetadataViewset to avoid mis-planned subqueries#14770

Open
rtibbles wants to merge 1 commit into
learningequality:developfrom
rtibbles:fix-importmetadata-postgres-hang
Open

Materialise FK lists in ImportMetadataViewset to avoid mis-planned subqueries#14770
rtibbles wants to merge 1 commit into
learningequality:developfrom
rtibbles:fix-importmetadata-postgres-hang

Conversation

@rtibbles
Copy link
Copy Markdown
Member

@rtibbles rtibbles commented May 28, 2026

Summary

  • __in=<queryset> in ImportMetadataViewset._serialize gave postgres correlated subqueries it mis-planned on fresh test DBs.
  • Hung the postgres CI job ~100 min on ~30% of runs.
  • Likely surfaced now because import metadata was recently extended to recurse through courses to transfer the full metadata.
  • Switch to materialised lists with the existing __uuidin lookup (inlines IDs as SQL literals).

References

  • Postgres CI hang observed intermittently across multiple recent PRs.

  • Investigation + fault-handler instrumentation temporarily on: PR Remove task-worker multiprocessing #14754 (now removed). Captured stack (CI run):

    kolibri/core/content/test/test_public_api.py Timeout (0:02:00)!
    Thread 0x... (most recent call first):
      File ".../django/db/backends/utils.py", line 84 in _execute
      File ".../kolibri/core/content/public_api.py", line 134 in _serialize
      File ".../kolibri/core/content/public_api.py", line 180 in retrieve
      ...
      File ".../test_public_api.py", line 70 in test_import_metadata_assessmentmetadata
    

    Same stack across 11 consecutive 120-s dumps over ~2 h before the suite recovered.

Reviewer guidance

  • test_public_api.py exercises every _serialize query against postgres — that suite plus the existing channel-import flow is the primary check.
  • Empty-input cases for the __uuidin filters (e.g. a tag-less channel) flow through the EmptyResultSet branch — worth a glance at the loop's try/except.
  • Lang-code dedupe replaces the prior Q(...) | Q(...) OR-subquery — confirm None is filtered and no codes are lost.

AI usage

Used Claude Code for investigation, diagnosis (fault-handler instrumentation + CI stack capture), and drafting the patch. I guided the lookup choice (__uuidin), reviewed the diff and drove revisions, and confirmed test_public_api.py passes against postgres locally.

@github-actions github-actions Bot added DEV: backend Python, databases, networking, filesystem... SIZE: small labels May 28, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 28, 2026

@rtibbles rtibbles marked this pull request as ready for review May 28, 2026 14:41
Replace `__in=<queryset>` with materialised lists and the existing `__uuidin` lookup (which inlines IDs as SQL literals) for UUID-keyed columns. The subquery form gave postgres correlated subqueries it sometimes mis-planned on a fresh test database, producing multi-hour query times in CI. `__uuidin` also sidesteps SQLITE_MAX_VARIABLE_NUMBER. Language codes are dedup'd to a small set and use a plain literal IN. Catch EmptyResultSet from `__uuidin=[]` in the serialization loop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rtibbles rtibbles force-pushed the fix-importmetadata-postgres-hang branch from 86cbf90 to 187330e Compare May 28, 2026 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

DEV: backend Python, databases, networking, filesystem... SIZE: small

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant