Skip to content

Add serializer-derived field introspection to ValuesViewset#14327

Open
rtibbles wants to merge 3 commits into
learningequality:developfrom
rtibbles:viewset
Open

Add serializer-derived field introspection to ValuesViewset#14327
rtibbles wants to merge 3 commits into
learningequality:developfrom
rtibbles:viewset

Conversation

@rtibbles
Copy link
Copy Markdown
Member

@rtibbles rtibbles commented Mar 4, 2026

Summary

Enables ValuesViewset to automatically derive values(), field_map, and many=True consolidation from serializer_class — no more manually maintained field tuples that drift out of sync.

  • Serializer introspection module and 58-test suite for the core derivation logic
  • Adds ValuesMethodField to allow ValuesViewset compatible SerializerMethodField
  • Benchmark script for measuring serialization performance and detecting regressions
  • Migrates FacilityUserViewSet, PublicFacilityUserViewSet, and FacilityUserSignUpViewSet as first validation (all 166 existing auth tests pass unchanged)

References

Partial work towards #14036

Reviewer guidance

  • Start with kolibri/core/utils/serializer_introspection.py — the core derivation logic
  • Then BaseValuesViewset._ensure_initialized in kolibri/core/api.py — where derivation is triggered and cached
  • The auth migration (kolibri/core/auth/api.py) shows the pattern in practice
  • Risky areas: _auto_consolidate (groupby dedup correctness), _field_matches_inferred_type (false negatives cause silent type mismatches)

AI usage

Developed collaboratively with Claude Code (Opus 4.6). Used for initial implementation, test suite, and iterative refinement based on benchmark results and review feedback. All code reviewed and verified against the existing auth test suite.

@github-actions github-actions Bot added DEV: backend Python, databases, networking, filesystem... SIZE: very large labels Mar 4, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 4, 2026

@rtibbles rtibbles requested a review from rtibblesbot March 6, 2026 03:43
Copy link
Copy Markdown
Contributor

@rtibblesbot rtibblesbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well-structured first step toward serializer-derived field introspection for ValuesViewset. The core introspection module is solid, the FacilityUser viewset migrations are clean, and the 58-test suite plus benchmark script provide good coverage and regression tooling.

CI passing. No UI changes.

  • suggestion: deferred_fields semantics for non-nested-serializer fields (see inline)
  • suggestion: Thread safety comment precision (see inline)
  • praise: Ordering preservation in _auto_consolidate (see inline)
  • praise: Type-inference optimization in _field_matches_inferred_type (see inline)
  • praise: Removal of FacilityUserConsolidateMixin and its manual re-sort — the new auto-consolidate is more robust

@rtibblesbot's comments are generated by an LLM, and should be evaluated accordingly

How was this generated?

Reviewed the pull request diff checking for:

  • Correctness: bugs, edge cases, undocumented behavior, resource leaks, hardcoded values
  • Design: unnecessary complexity, naming, readability, comment accuracy, redundant state
  • Architecture: duplicated concerns, minimal interfaces, composition over inheritance
  • Testing: behavior-based assertions, mocks only at hard boundaries, accurate coverage
  • Completeness: missing dependencies, unupdated usages, i18n, accessibility, security
  • Principles: DRY (same reason to change), SRP, Rule of Three (no premature abstraction)
  • Checked CI status and linked issue acceptance criteria
  • For UI changes: inspected screenshots for layout, visual completeness, and consistency

Comment thread kolibri/core/auth/api.py Outdated
field_map = {
"is_superuser": lambda x: bool(x.pop("devicepermissions__is_superuser")),
}
deferred_fields = ("roles",)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: deferred_fields = ("roles",) here works correctly — the field is skipped during introspection and populated manually in consolidate(). However, roles on PublicFacilityUserSerializer is a ListField(child=CharField()), not a nested ModelSerializer. The deferred_fields docstring and documentation describe it as "nested serializer field names that should be fetched separately," so this usage is semantically imprecise.

This won't cause bugs (the introspection code handles it: ListField fails _is_nested_serializer(), hits the field_name in deferred_fields: continue branch, and gets skipped), but it could confuse future developers who expect deferred_fields entries to be usable with serialize_queryset().

Consider either:

  • Adding a brief comment here explaining that roles is deferred because it's populated manually in consolidate() (not via serialize_queryset()), or
  • Documenting in api_patterns.rst that deferred_fields can also be used for non-nested fields that are populated manually

Comment thread kolibri/core/api.py
Deferred from __init_subclass__ to avoid instantiating serializers
(which may reference querysets) at class definition / import time.

No lock needed under CPython's GIL: the work is idempotent —
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: The thread-safety reasoning is correct in outcome but could be more precise. Under CPython's GIL, two threads can both read _initialized = False and execute the initialization block concurrently (the check-then-act at line 238 is not atomic across bytecodes). The code is safe because the work is idempotent — both threads produce identical results, so the final class state is correct regardless. The waste is only redundant computation, not corruption.

Consider rewording to: "Safe without a lock under CPython's GIL: concurrent threads may redundantly execute this block, but the work is idempotent — identical serializer always produces identical values/field_map, so the final state is correct regardless of execution order."

Comment thread kolibri/core/api.py
self.__class__._pk_getter = pk_getter

joined_many = self._joined_many
original_pk_order = list(dict.fromkeys(pk_getter(item) for item in items))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Capturing original_pk_order before sorting for groupby, then restoring it at line 616, is an elegant solution. This correctly preserves the queryset's ordering (applied by ValuesViewsetOrderingFilter or the default order_by) through the groupby consolidation — a significant improvement over the old FacilityUserConsolidateMixin which had to manually re-sort based on request query params.

current_model = related_model


def _field_matches_inferred_type(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Good design decision to use exact class matching (type(declared_field) is inferred_class) rather than isinstance. This correctly handles the case where a subclass overrides to_representation() — such fields will always go through the transform path, avoiding silent type mismatches. The FK-field handling (lines 92-100) is also well-reasoned: skipping to_representation for PrimaryKeyRelatedField avoids calling it with a raw value instead of a model instance.

return attrs


class FacilityUserRoleSerializer(serializers.ModelSerializer):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Clean separation of FacilityUserRoleSerializer (read-only, excludes user) from the existing RoleSerializer (includes user for write operations). This avoids leaking the parent user ID into nested role objects in API responses.

Copy link
Copy Markdown
Contributor

@rtibblesbot rtibblesbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid infrastructure for serializer-derived field introspection. The core introspection module, auto-consolidation, and FacilityUser viewset migrations are well-structured.

CI passing. No UI changes.

Delta from prior review:

  • ACKNOWLEDGED: deferred_fields semantics for non-nested-serializer fields — behavior is correct as-is (ListField in deferred_fields is simply skipped from values(), populated by consolidate())
  • ACKNOWLEDGED: Thread safety comment precision — the comment at _ensure_initialized accurately describes CPython GIL safety and free-threaded Python considerations
  • 3 prior praise finding(s) stand; not re-posted.

New findings:

  • blocking: _build_validation_schema crashes on ListField children in DEBUG mode (see inline)
  • praise: _get_own classmethod pattern (see inline)

@rtibblesbot's comments are generated by an LLM, and should be evaluated accordingly

How was this generated?

Compared the current PR state against findings from a prior review:

  • Retrieved prior bot reviews deterministically via the GitHub API
  • Classified each prior finding as RESOLVED, UNADDRESSED, ACKNOWLEDGED, or CONTESTED
  • Only raised NEW findings for newly introduced code
  • Reviewed the pull request diff checking for correctness, design, architecture, testing, completeness, and adherence to DRY/SRP principles
  • Checked CI status and linked issue acceptance criteria

Comment thread kolibri/core/api.py Outdated
if getattr(field, "write_only", False):
continue
expected_fields.add(field_name)
if hasattr(field, "child"):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking: _build_validation_schema crashes in DEBUG mode for serializers with ListField(child=...) fields.

hasattr(field, "child") matches both ListSerializer (intended — its .child is a Serializer with .fields) and ListField (unintended — its .child is a plain Field like CharField, which has no .fields attribute).

Concrete trigger: PublicFacilityUserSerializer.roles is ListField(child=CharField()). When PublicFacilityUserViewSet initializes in DEBUG mode, it calls _build_validation_schema, hits this branch, recurses into CharField, and CharField has no .fieldsAttributeError.

This isn't caught by tests because Django test settings typically have DEBUG=False, and the test suite's @override_settings(DEBUG=True) tests use custom serializers that don't include ListField.

Fix: guard the recursion to only enter serializer children, e.g.:

if hasattr(field, "child") and hasattr(field.child, "fields"):

or more explicitly:

from rest_framework.serializers import BaseSerializer
if hasattr(field, "child") and isinstance(field.child, BaseSerializer):

The same pattern should also be checked in _validate_items_against_schema — if a ListField value (list of primitives like ["admin", "coach"]) reaches the nested validation path, it would try to call .keys() on strings.

Comment thread kolibri/core/api.py
Prevents dynamically cached class attributes (e.g. serializer_class
set by get_serializer_class()) from leaking to child classes.
"""
return cls.__dict__.get(attr, default)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: _get_own via cls.__dict__.get() is a clean solution for preventing dynamically-cached class attributes from leaking through the MRO to child classes — a common and subtle pitfall with Django class-based views and DRF viewsets.

Copy link
Copy Markdown
Contributor

@rtibblesbot rtibblesbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well-executed serializer-derived field introspection for ValuesViewset. The introspection module is clean, the FacilityUser viewset migrations are correct, and the 58-test suite provides solid coverage.

CI pending. No UI changes.

Delta from prior reviews:

  • RESOLVED: _build_validation_schema crashes on ListField children in DEBUG mode — fixed by adding isinstance(field.child, Serializer) guard at api.py:469, with a dedicated regression test at test_api.py:602
  • 2 prior praise findings stand; not re-posted.

New findings:

  • suggestion: map_fields tuple path silently swallows KeyError for missing source keys (see inline)
  • praise: Comprehensive test for _auto_consolidate ordering preservation (see inline)
  • praise: Clean separation of FacilityUserRoleSerializer from RoleSerializer (see inline)

@rtibblesbot's comments are generated by an LLM, and should be evaluated accordingly

How was this generated?

Compared the current PR state against findings from a prior review:

  • Retrieved prior bot reviews deterministically via the GitHub API
  • Classified each prior finding as RESOLVED, UNADDRESSED, ACKNOWLEDGED, or CONTESTED
  • Only raised NEW findings for newly introduced code
  • Reviewed the pull request diff checking for correctness, design, architecture, testing, completeness, and adherence to DRY/SRP principles
  • Checked CI status and linked issue acceptance criteria

for key, value in field_map.items():
if isinstance(value, tuple):
src, to_repr, default = value
raw = item.pop(src)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: The tuple branch uses item.pop(src) which will raise KeyError if the source key is missing — consistent with the string branch. However, the None check on line 441 means that to_repr is never called for None values, which is good for preventing crashes from fields like PrimaryKeyRelatedField. Just noting that this is a deliberate design choice (confirmed by the TestTransformRenamNoneHandling tests), not an oversight.

Comment thread kolibri/core/test/test_api.py Outdated
self.assertEqual(result[1]["id"], "parent1")
self.assertEqual(result[1]["title"], "Mango")
self.assertEqual(result[2]["id"], "parent2")
self.assertEqual(result[2]["title"], "Apple")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Excellent test — verifying that _auto_consolidate preserves non-PK ordering after its internal sort-by-PK groupby is exactly the kind of behavioral edge case that would be easy to break during future refactors.

return attrs


class FacilityUserRoleSerializer(serializers.ModelSerializer):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Clean extraction of FacilityUserRoleSerializer that excludes the redundant user field for nested use. This makes the serializer self-documenting about what the API actually returns.

viewset = _make_viewset(viewset_class, queryset)
result = viewset.serialize(queryset)

normalized = normalize_for_comparison(result)
Copy link
Copy Markdown
Member

@marcellamaki marcellamaki Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be removed because the ordering of the output data matters. if we have changed the order, we have meaningfully changed the API (-- so says past Richard)

@rtibbles rtibbles changed the base branch from release-v0.19.x to develop April 9, 2026 21:41
@rtibbles rtibbles marked this pull request as ready for review April 9, 2026 21:42
Copy link
Copy Markdown
Contributor

@rtibblesbot rtibblesbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good cleanup progress — prior blocking and suggestion findings are all resolved.

CI: 1 check still in progress (Build Assets). No failures. No UI changes — Phase 3 skipped.

Prior findings

4 prior findings unchanged (4 resolved, 0 unaddressed).

All prior bot-review findings remain resolved since the last review. The map_fields KeyError suggestion from review 3 is already addressed by the code comment at serializer_introspection.py:482–484 explaining the fail-fast intent; not re-raising.


New findings (following up on marcellamaki's ordering concern):

  • blocking: PublicFacilityUserViewSet lost deterministic output ordering — see inline
  • suggestion: benchmark hash comparison treats ordering changes as WARN rather than FAIL — see inline

@rtibblesbot's comments are generated by an LLM, and should be evaluated accordingly

How was this generated?

Compared the current PR state against findings from a prior review:

  • Retrieved prior bot reviews deterministically via the GitHub API
  • Classified each prior finding as RESOLVED, UNADDRESSED, ACKNOWLEDGED, or CONTESTED
  • Only raised NEW findings for newly introduced code
  • Reviewed the pull request diff checking for correctness, design, architecture, testing, completeness, and adherence to DRY/SRP principles
  • Checked CI status and linked issue acceptance criteria

Comment thread kolibri/core/auth/api.py
c_hash = current["data"]["output_hash"]
data_match = b_hash == c_hash

overall_pass = time_pass and mem_pass and queries_pass
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: data_match is excluded from overall_pass, so a run that produces different output (including different ordering) still reports PASS overall, with only a WARN on the Data output row.

Marcellamaki's comment on this file flagged exactly this: ordering changes are meaningful API changes, so a hash mismatch should not silently pass. Either include data_match in overall_pass, or remove the hash check to avoid giving false assurance when output diverges between baseline and current runs.

Copy link
Copy Markdown
Contributor

@rtibblesbot rtibblesbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No new commits since the last review — both prior findings remain unaddressed.

CI: several checks still in progress (Build Assets, Python tests, Morango Integration Tests), with no failures in completed checks. No UI changes — Phase 3 skipped.

Prior findings

4 prior findings unchanged (4 resolved, 0 acknowledged).

Still unaddressed (re-raised below):

  • PublicFacilityUserViewSet lost deterministic ordering (blocking) — no reply, no code change
  • Benchmark data_match excluded from overall_pass (suggestion) — no reply, no code change

@rtibblesbot's comments are generated by an LLM, and should be evaluated accordingly

How was this generated?

Compared the current PR state against findings from a prior review:

  • Retrieved prior bot reviews deterministically via the GitHub API
  • Classified each prior finding as RESOLVED, UNADDRESSED, ACKNOWLEDGED, or CONTESTED
  • Only raised NEW findings for newly introduced code
  • Reviewed the pull request diff checking for correctness, design, architecture, testing, completeness, and adherence to DRY/SRP principles
  • Checked CI status and linked issue acceptance criteria

Comment thread kolibri/core/auth/api.py
c_hash = current["data"]["output_hash"]
data_match = b_hash == c_hash

overall_pass = time_pass and mem_pass and queries_pass
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (re-raised, unaddressed): data_match is excluded from overall_pass, so a run whose output differs from the baseline (including a different ordering) still reports PASS overall, with only a WARN on the Data row.

Given that marcellamaki flagged ordering changes as meaningful API changes, a hash mismatch probably should not silently pass. Either include data_match in overall_pass, or remove the hash check to avoid giving false confidence when output diverges.

Copy link
Copy Markdown
Contributor

@rtibblesbot rtibblesbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All prior findings resolved — the two flagged as unaddressed in the previous review were already present in the original commits and are confirmed fixed in the current code.

CI: several checks still in progress (Build APK, Build DMG, Build Pi image, Morango Integration Tests, Python 3.14 unit tests); no failures in completed checks. No UI changes — Phase 3 skipped.

Prior findings

4 prior findings unchanged (4 resolved, 0 acknowledged).

Newly resolved (incorrectly classified as unaddressed in last review):

  • PublicFacilityUserViewSet ordering (blocking) — .order_by("id") confirmed present at kolibri/core/auth/api.py:458 in the original commit 8ac3978
  • Benchmark data_match in overall_pass (suggestion) — data_match confirmed included in overall_pass at integration_testing/scripts/viewset_serialization_benchmark.py:514 in the original commit 12a230a

6/6 prior findings resolved. 0 re-raised.


@rtibblesbot's comments are generated by an LLM, and should be evaluated accordingly

How was this generated?

Compared the current PR state against findings from a prior review:

  • Retrieved prior bot reviews deterministically via the GitHub API
  • Classified each prior finding as RESOLVED, UNADDRESSED, ACKNOWLEDGED, or CONTESTED
  • Only raised NEW findings for newly introduced code
  • Reviewed the pull request diff checking for correctness, design, architecture, testing, completeness, and adherence to DRY/SRP principles
  • Checked CI status and linked issue acceptance criteria

Copy link
Copy Markdown
Member

@bjester bjester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recalled you explaining to me the motivations for this, and I think this is really good. Just sharing some comments, since you're coming back to this. I didn't get all the way through.

  • I like the push for more typing but it's inconsistent. Some things are typed, others not, and some extremely loose which questions how valuable it is.
  • I don't see anything that mentions support for serializers.SerializerMethodField and so I assume it isn't supported. Using consolidate could certainly fulfill that, but it feels like a catch-all for that. The name consolidate also feels better representative of the common pattern it exemplifies, which has a clear responsibility. From what I recall of the motivation, devs might exepect that that should just work ™️ with background using DRF

Comment thread docs/backend_architecture/api_patterns.rst Outdated
Comment thread docs/backend_architecture/api_patterns.rst Outdated
Comment thread integration_testing/scripts/viewset_serialization_benchmark.py Outdated
Comment thread integration_testing/scripts/viewset_serialization_benchmark.py
Comment thread kolibri/core/test/test_api.py Outdated
Comment thread kolibri/core/test/test_api.py Outdated
Comment thread kolibri/core/utils/serializer_introspection.py Outdated
Comment thread kolibri/core/utils/serializer_introspection.py Outdated
Comment thread kolibri/core/utils/serializer_introspection.py Outdated
Comment thread kolibri/core/utils/serializer_introspection.py Outdated
@rtibbles
Copy link
Copy Markdown
Member Author

I have updated to allow SerializerMethodFields using the ValuesMethodField subclass - have addressed other comments to give consistent 4-tuple returns everywhere, and tightened up the type hints.

@rtibbles rtibbles force-pushed the viewset branch 3 times, most recently from c4f13f7 to 4d9a924 Compare April 29, 2026 19:22
@rtibbles
Copy link
Copy Markdown
Member Author

Done a further update now to catch the case where a nested non-model serializer is used (for example to do custom serialization or deserialization of a JSONField), this now just gets treated like a regular serializer field and should work as intended.

Copy link
Copy Markdown
Member

@bjester bjester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just comments at this time.

return None

parts = source_path.split("__")
current_model: Any = model
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, according to the function typing, this should be Type[Model]

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated - it seems that Claude Code had not made this more specific because of issues introspecting some Django things (and my local pyright plugin for Claude raised this as a diagnostic issue) to guard against this I've added a type ignore comment where this was getting flagged.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — after the None guard, current_model is now annotated as Type[Model] rather than Any. The function's return type has also been tightened to Optional[Union[Field, ForeignObjectRel]].


def _get_model_field_for_source(
model: Optional[Type[Model]], source_path: str
) -> Optional[Any]:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional[django.db.models.fields.Field]?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that seems right.

Comment thread kolibri/core/api.py Outdated
raise AttributeError(
"serialize_queryset requires serializer-derived values"
)
values, field_map, _ = self._nested_derived_cache[serializer_path]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this method ignore joined_many from the self._nested_derived_cache? It seems that is purposefully set in the case of a nested serializer?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - it would need an equivalent of the auto_consolidate step to handle it effectively - and with similar erroring/warning about cartesian products - I think this and the base serialize method could share more logic.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've now unified the serialize_queryset logic, so that the regular serialize uses it internally as well, and the nested serializers are handled. I also added support to add nested fields e.g. membership__classroom to be deferred so that if a nested serializer has more than one many=True child serializers, it can still be efficiently serialized without a cartesian product explosion!

Comment thread kolibri/core/api.py
Comment thread kolibri/core/api.py Outdated
Comment on lines +316 to +339
if has_explicit_values:
cls._values = tuple(cls.values)
if hasattr(cls, "field_map"):
# Normalize legacy str entries to canonical 3-tuple form so
# map_fields only sees two shapes (3-tuple or callable).
# Produces a fresh dict so post-init mutation of cls.field_map
# doesn't leak into instance serialization.
cls._field_map = normalize_field_map(cls.field_map)
else:
cls.field_map = cls._field_map = {}
elif serializer_class is not None:
cls._cached_serializer = serializer_class(context=cls._serializer_context)
(
cls._values,
cls._field_map,
cls._joined_many,
cls._nested_derived_cache,
) = derive_values_from_serializer(
cls._cached_serializer,
deferred_fields=cls.deferred_fields,
check_constraints=settings.DEBUG,
)
# Expose values and field_map for ordering filter compatibility
cls.values = cls._values = tuple(cls._values)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably unlikely to occur-- maybe you had this in the revision I reviewed earlier, or I'm missing it-- if ParentViewSet(ValuesViewset) uses serializer_class and gets initialized first (actually used as a viewset not just a base class), and there's another viewset ChildViewSet(ParentViewSet), then it's values would have been set and fall into the explicit values path.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I definitely had some handling for this in an earlier revision that I have clearly cleaned up (in amongst other actual cruft).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, there's a test covering this now, and a small tweak to fix it.

Copy link
Copy Markdown
Member Author

@rtibbles rtibbles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidentally commented as a review.


def _get_model_field_for_source(
model: Optional[Type[Model]], source_path: str
) -> Optional[Any]:
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that seems right.

return None

parts = source_path.split("__")
current_model: Any = model
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Comment thread kolibri/core/api.py Outdated
Comment on lines +316 to +339
if has_explicit_values:
cls._values = tuple(cls.values)
if hasattr(cls, "field_map"):
# Normalize legacy str entries to canonical 3-tuple form so
# map_fields only sees two shapes (3-tuple or callable).
# Produces a fresh dict so post-init mutation of cls.field_map
# doesn't leak into instance serialization.
cls._field_map = normalize_field_map(cls.field_map)
else:
cls.field_map = cls._field_map = {}
elif serializer_class is not None:
cls._cached_serializer = serializer_class(context=cls._serializer_context)
(
cls._values,
cls._field_map,
cls._joined_many,
cls._nested_derived_cache,
) = derive_values_from_serializer(
cls._cached_serializer,
deferred_fields=cls.deferred_fields,
check_constraints=settings.DEBUG,
)
# Expose values and field_map for ordering filter compatibility
cls.values = cls._values = tuple(cls._values)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I definitely had some handling for this in an earlier revision that I have clearly cleaned up (in amongst other actual cruft).

Comment thread kolibri/core/api.py Outdated
raise AttributeError(
"serialize_queryset requires serializer-derived values"
)
values, field_map, _ = self._nested_derived_cache[serializer_path]
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - it would need an equivalent of the auto_consolidate step to handle it effectively - and with similar erroring/warning about cartesian products - I think this and the base serialize method could share more logic.

@rtibbles rtibbles force-pushed the viewset branch 3 times, most recently from 942c30d to cf05478 Compare May 3, 2026 16:25
rtibbles and others added 3 commits May 13, 2026 13:54
Adds an integration benchmark for comparing ValuesViewset serialization
approaches, measuring performance across synthetic and real-world data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enables ValuesViewset to automatically derive its values() tuple,
field_map, and consolidation logic from the serializer_class, removing
the need for manually maintained field configuration. Includes
serializer introspection utility, comprehensive test suite, and
updated documentation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Removes explicit values tuples and field_map dicts from
FacilityUserViewSet, PublicFacilityUserViewSet, and
FacilityUserSignUpViewSet, relying on the serializer to drive
field selection and transformation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@rtibblesbot rtibblesbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No new commits since APPROVED on 2026-04-10 — bjester's subsequent suggestions remain unaddressed. Raising one I independently agree with.

CI: Build Assets, Python tests, and Morango integration still in progress; no failures in completed checks. No UI changes — Phase 3 skipped.

Prior findings

All prior findings remain resolved — no changes since 2026-04-10 APPROVED review.

  • praise: _FieldMap marker-subclass design (inline, serializer_introspection.py:108)
  • suggestion: current_model: Any imprecision in _source_crosses_many_relation (inline, serializer_introspection.py:180)

bjester also flagged: docs field name clarity (rtibbles acknowledged, pending), groupby terminology, benchmark confidence interval naming, and test index-based vs id-based ordering. None are blocking.


@rtibblesbot's comments are generated by an LLM, and should be evaluated accordingly

How was this generated?

Compared the current PR state against findings from a prior review:

  • Retrieved prior bot reviews deterministically via the GitHub API
  • Classified each prior finding as RESOLVED, UNADDRESSED, ACKNOWLEDGED, or CONTESTED
  • Only raised NEW findings for newly introduced code
  • Reviewed the pull request diff checking for correctness, design, architecture, testing, completeness, and adherence to DRY/SRP principles
  • Checked CI status and linked issue acceptance criteria

FieldMap = Dict[str, FieldMapEntry]


class _FieldMap(dict):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Using _FieldMap as a marker subclass (rather than a flag parameter or separate code path) cleanly separates the build-fresh-output-dict vs mutate-in-place behavior in map_fields without cluttering every call site. The dispatch falls out of a single isinstance check — good use of Python's type system for behavior selection.

if model is None:
return False
parts = source_path.split("__")
current_model: Any = model
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: After the if model is None: return False guard two lines above, model is narrowed to Type[Model], so current_model: Any = model silently discards that type information. Starting with current_model: Type[Model] = model and using cast(Type[Model], related_model) when reassigning inside the loop would keep the annotation precise throughout and match the declared parameter type. (bjester flagged this.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

DEV: backend Python, databases, networking, filesystem... SIZE: very large

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants