Automatically defer nested serializer fields that cannot be joined in a single query #14815

@rtibbles

This issue is not open for contribution. Visit Contributing guidelines to learn about the contributing process and how to find suitable issues.

Overview

Extend the serializer introspection behind BaseValuesViewset so that nested serializer configurations that cannot be joined in a single query — multiple many=True nested serializers at one level, deep nesting, single-FK nesting, and M2M relations — are deferred automatically and fetched as separate batched queries, instead of raising and requiring manual deferred_fields plus a custom consolidate().

Complexity: High
Target branch: develop

Context

The serializer-derived pattern (#14036) makes the serializer the single source of truth for API shape. The infrastructure merged in #14327 enforces constraints at introspection time: at most one joined many=True nested serializer per level (more would produce a cartesian product) and no deep nesting of joined nested serializers — both raise a TypeError pointing at deferred_fields plus a custom consolidate().
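The cartesian-product concern can be reproduced with a minimal sketch against an in-memory SQLite database. The table and column names here (node, file, assessment, node_id) are hypothetical stand-ins for illustration, not Kolibri's actual schema.

```python
import sqlite3

# Hypothetical schema: one parent table with two reverse-FK child tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE node (id INTEGER PRIMARY KEY);
    CREATE TABLE file (id INTEGER PRIMARY KEY, node_id INTEGER);
    CREATE TABLE assessment (id INTEGER PRIMARY KEY, node_id INTEGER);
    INSERT INTO node (id) VALUES (1);
    INSERT INTO file (id, node_id) VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO assessment (id, node_id) VALUES (1, 1), (2, 1);
    """
)

# Joining both to-many relations in one query multiplies the child rows.
joined_rows = conn.execute(
    """
    SELECT node.id, file.id, assessment.id
    FROM node
    LEFT JOIN file ON file.node_id = node.id
    LEFT JOIN assessment ON assessment.node_id = node.id
    """
).fetchall()

# Fetching each relation separately returns each child row exactly once.
file_rows = conn.execute("SELECT id FROM file WHERE node_id IN (1)").fetchall()
assessment_rows = conn.execute(
    "SELECT id FROM assessment WHERE node_id IN (1)"
).fetchall()
separate_row_count = len(file_rows) + len(assessment_rows)
```

With three files and two assessments on a single parent, the joined query returns 3 * 2 rows with every child duplicated, while the two separate queries return 3 + 2 rows in total.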

That escape hatch reintroduces exactly the manual plumbing the refactor is meant to remove. The efficient query strategy for these shapes is nearly always the same fixed pattern — one batched query per nested field, bucketed back onto the parent rows — so the framework can apply it automatically rather than asking each viewset to hand-write it.

The motivating case is #14300: the ContentNode serializer needs files (reverse FK, itself nesting lang), assessmentmetadata (reverse FK), and tags (M2M) — under the current constraints all but one must be manually deferred and consolidated.

The Change

At introspection time, detect nested serializer configurations that cannot be joined in a single query and defer them automatically — no serializer shape raises at viewset instantiation any more.

Each auto-deferred field is fetched with one batched query against the child model, filtered by the parent PKs and bucketed back onto the parent rows by the relation. All relation types are supported: reverse FK, M2M, and single FK. Fetched child rows run through the same consolidation pipeline, so deeper auto-deferred chains stay batched per level — query count scales with nesting depth, not row count.
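The fetch-and-bucket step for a reverse FK might look roughly like the sketch below. It is a plain-Python illustration, not the BaseValuesViewset implementation: attach_reverse_fk, fetch_files, FILE_TABLE and the contentnode_id column are all assumed names, and the "query" is simulated with a list so the example runs without a database.

```python
from collections import defaultdict

# Hypothetical stand-in for the child table; real code would query the ORM.
FILE_TABLE = [
    {"id": 10, "contentnode_id": 1, "extension": "mp4"},
    {"id": 11, "contentnode_id": 1, "extension": "vtt"},
    {"id": 12, "contentnode_id": 2, "extension": "pdf"},
]
query_log = []  # records one entry per simulated query


def fetch_files(parent_pks):
    """Simulate a single batched query filtered by the parent PKs."""
    query_log.append(tuple(parent_pks))
    wanted = set(parent_pks)
    return [dict(row) for row in FILE_TABLE if row["contentnode_id"] in wanted]


def attach_reverse_fk(parent_rows, fetch_children, field_name, fk_name, pk_name="id"):
    """Fetch all children in one call, then bucket them onto their parents."""
    parent_pks = [row[pk_name] for row in parent_rows]
    child_rows = fetch_children(parent_pks)
    buckets = defaultdict(list)
    for child in child_rows:
        buckets[child[fk_name]].append(child)
    for row in parent_rows:
        # Parents with no children get an empty list rather than a missing key.
        row[field_name] = buckets.get(row[pk_name], [])
    return parent_rows


nodes = [{"id": 1}, {"id": 2}, {"id": 3}]
attach_reverse_fk(nodes, fetch_files, field_name="files", fk_name="contentnode_id")
```

Three parents produce one simulated query, not three. Applying the same function to the fetched child rows for their own nested fields would add one query per level, which is the scaling behaviour described above.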

Where multiple auto-deferred fetches target the same related model (e.g. lang on both ContentNode and its File rows), the fetches are merged into a single query against that model and the results distributed to each referencing level.
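One way to picture the merged fetch is to collect the foreign keys from every referencing level, issue a single lookup, and hand the results back out. The sketch below assumes hypothetical names throughout (fetch_shared_target, fetch_languages, LANGUAGE_TABLE, lang_id) and simulates the query with a dict.

```python
# Hypothetical stand-in for the shared related model's table.
LANGUAGE_TABLE = {
    "en": {"id": "en", "lang_name": "English"},
    "es": {"id": "es", "lang_name": "Spanish"},
    "fr": {"id": "fr", "lang_name": "French"},
}
query_log = []  # records one entry per simulated query


def fetch_languages(pks):
    """Simulate a single query against the shared model, filtered by PK."""
    query_log.append(tuple(pks))
    return [dict(LANGUAGE_TABLE[pk]) for pk in pks if pk in LANGUAGE_TABLE]


def fetch_shared_target(levels, fetch_by_pk, fk_name, field_name):
    """Merge the FK values from every level into one fetch, then distribute."""
    wanted = set()
    for rows in levels:
        wanted.update(row[fk_name] for row in rows if row[fk_name] is not None)
    by_pk = {obj["id"]: obj for obj in fetch_by_pk(sorted(wanted))}
    for rows in levels:
        for row in rows:
            # A null FK maps to None on the embedded field.
            row[field_name] = by_pk.get(row[fk_name])


nodes = [{"id": 1, "lang_id": "en"}, {"id": 2, "lang_id": None}]
files = [{"id": 10, "lang_id": "es"}, {"id": 11, "lang_id": "en"}]
fetch_shared_target([nodes, files], fetch_languages, "lang_id", "lang")
```

Both levels reference "en", but it is requested once, and the whole operation counts as a single query regardless of how many levels point at the same model.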

Explicit deferred_fields remains the developer's contract: a field listed there is never auto-fetched, and stays the right tool when the fetch needs annotations, filtering, or any logic beyond plain fetch-and-embed.

Each auto-defer decision emits a DEBUG-level log so automatic query expansion is visible without inspecting introspection internals.
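The decision rules and the logging could be sketched as follows. Everything here is illustrative: plan_nested_fields, the spec keys (many, m2m, has_nested) and the logger name are hypothetical, and the choice to keep the first eligible many=True field joined is an assumption of this sketch rather than something the issue specifies.

```python
import logging

logger = logging.getLogger("autodefer_sketch")  # hypothetical logger name


def plan_nested_fields(nested_fields, explicit_deferred=()):
    """Split nested fields into joined and auto-deferred lists.

    nested_fields maps a field name to a dict of flags describing it.
    Fields in explicit_deferred are skipped entirely: they are the
    developer's responsibility and are never auto-fetched.
    """
    joined, auto_deferred = [], []
    joined_many_seen = False
    for name, spec in nested_fields.items():
        if name in explicit_deferred:
            continue
        reason = None
        if spec.get("m2m"):
            reason = "M2M relation"
        elif spec.get("has_nested"):
            reason = "nested serializer has its own nested fields"
        elif spec.get("many") and joined_many_seen:
            # Assumption: the first eligible many=True field stays joined.
            reason = "another many=True field is already joined at this level"
        if reason is not None:
            logger.debug("Auto-deferring nested field %r: %s", name, reason)
            auto_deferred.append(name)
        else:
            joined.append(name)
            joined_many_seen = joined_many_seen or bool(spec.get("many"))
    return joined, auto_deferred


logging.basicConfig(level=logging.DEBUG)
contentnode_fields = {
    "files": {"many": True, "has_nested": True},
    "assessmentmetadata": {"many": True},
    "tags": {"many": True, "m2m": True},
    "lang": {"many": False},
}
joined, auto_deferred = plan_nested_fields(contentnode_fields)
joined_explicit, auto_explicit = plan_nested_fields(
    contentnode_fields, explicit_deferred={"tags"}
)
```

Listing tags in explicit_deferred removes it from both lists, mirroring the rule that explicitly deferred fields are left to the developer's own consolidate().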

Out of Scope

Acceptance Criteria

  • Multiple many=True nested serializers at one level serialize correctly via automatic deferral
  • Deep nesting auto-defers the outer field, for both many=True and single-FK outer serializers
  • M2M nested serializers auto-defer correctly
  • Auto-deferred fetches targeting the same related model are merged into a single query
  • Query budgets pinned with assertNumQueries: one batched query per auto-deferred field per level, shared-target fetches counted once
  • No serializer shape raises at viewset instantiation; the constraint-raise tests are removed/replaced accordingly
  • Explicit deferred_fields passthrough is unchanged: the developer's consolidate() runs untouched
  • Each auto-defer decision emits a DEBUG-level log
  • Behaviour asserted end-to-end through viewset.serialize(), following the existing conventions in kolibri/core/test/test_api.py
  • Synthetic benchmark enhanced to compare an auto-deferred serializer-derived viewset against an equivalent explicit values/field_map + manual consolidate() viewset, demonstrating comparable performance
  • docs/backend_architecture/api_patterns.rst documents automatic deferral and its interaction with explicit deferred_fields

References

AI usage

This issue was written with Claude Code, section by section under my direction. The underlying design came out of an earlier Claude-assisted planning session, in which I prototyped the reverse-FK deferral against a pre-merge version of #14327 to validate the approach. The scope here (M2M and single-FK support, merged same-model fetches, benchmark comparison) reflects my decisions about what that prototype showed was needed.

Metadata

Labels

DEV: backend (Python, databases, networking, filesystem...)
TAG: tech update / debt (Change not visible to user)
