perf: memoise [Lib.closure] keyed on (linking, for_, libs) #14521

Draft
robinbb wants to merge 8 commits into robinbb-14492-l8-raw-refs-cache from robinbb-14492-l9-lib-closure-memo

robinbb (Collaborator) commented May 13, 2026

Layer 9 of 9 of #14492. Pure performance.

Lib.closure is now defined as Memo.exec over a Memo.create keyed on (linking, for_, libs). The per-module filter calls Lib.closure twice per consumer module (once for direct_libs, once for must_glob_libs); without memoisation every call re-traverses the dependency graph.

The list-of-libs key is order- and multiplicity-sensitive — callers that share inputs (e.g. lib_deps_for_module at both call sites) need to canonicalise via List.sort_uniq ~compare:Lib.compare; the existing call sites already do.
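
To make the order- and multiplicity-sensitivity concrete, here is a minimal sketch using only the OCaml standard library. It is not Dune's `Memo` API: the `lib` record, the `mode` type, `closure_uncached`, and the `Hashtbl`-backed table are all hypothetical stand-ins, chosen only to show why a list-shaped key benefits from canonicalisation.

```ocaml
(* Hypothetical stand-ins: a library is an id plus its direct deps. *)
type lib = { id : int; deps : lib list }
type mode = Ocaml | Melange

let compare_lib a b = Int.compare a.id b.id

(* Counts real graph traversals so that cache hits are observable. *)
let traversals = ref 0

(* Unmemoised transitive closure, dependencies before dependents. *)
let closure_uncached (roots : lib list) : lib list =
  incr traversals;
  let seen = Hashtbl.create 16 in
  let rec visit acc l =
    if Hashtbl.mem seen l.id then acc
    else begin
      Hashtbl.add seen l.id ();
      l :: List.fold_left visit acc l.deps
    end
  in
  List.rev (List.fold_left visit [] roots)

(* Memo table keyed on (linking, for_, libs); libs are keyed by id list,
   so the key depends on both order and multiplicity. *)
let table : (bool * mode * int list, lib list) Hashtbl.t = Hashtbl.create 16

let closure ~linking ~for_ (roots : lib list) : lib list =
  let key = (linking, for_, List.map (fun l -> l.id) roots) in
  match Hashtbl.find_opt table key with
  | Some cached -> cached
  | None ->
    let result = closure_uncached roots in
    Hashtbl.add table key result;
    result

(* Sorting and deduplicating gives equal sets the same key. *)
let canonicalise roots = List.sort_uniq compare_lib roots

let a = { id = 1; deps = [] }
let b = { id = 2; deps = [ a ] }
let c = { id = 3; deps = [ a; b ] }

let () =
  let run libs = ignore (closure ~linking:false ~for_:Ocaml libs) in
  run [ b; c ];                    (* miss: first traversal *)
  run [ c; b ];                    (* same set, different order: another miss *)
  run (canonicalise [ c; b; c ]);  (* canonical form is [b; c]: a hit *)
  assert (!traversals = 2)
```

In this sketch the uncanonicalised `[ c; b ]` call pays for a second traversal even though it names the same set of libraries, which is the cost the canonicalisation at the call sites is meant to avoid.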

Stack: rebases on #14520. Final layer of the stack.

Part of #14492. Related to #4572.

robinbb self-assigned this on May 13, 2026
robinbb force-pushed the robinbb-14492-l8-raw-refs-cache branch from ee1f8ce to 58ed81a on May 14, 2026 00:36
robinbb force-pushed the robinbb-14492-l9-lib-closure-memo branch 2 times, most recently from 215bfef to bf10fd7, on May 14, 2026 18:38
robinbb force-pushed the robinbb-14492-l8-raw-refs-cache branch from 58ed81a to ec1ccdb on May 14, 2026 23:58
robinbb force-pushed the robinbb-14492-l9-lib-closure-memo branch from bf10fd7 to 6470974 on May 14, 2026 23:58
robinbb force-pushed the robinbb-14492-l8-raw-refs-cache branch from ec1ccdb to 8054767 on May 15, 2026 02:57
robinbb force-pushed the robinbb-14492-l9-lib-closure-memo branch 2 times, most recently from 4f3d261 to 8ed3374, on May 16, 2026 03:59
robinbb added a commit that referenced this pull request May 16, 2026
Add [doc/dev/per-module-narrowing.md] describing the per-module
library file dependency narrowing introduced in #14492 (split into
PRs #14513..#14521 as layers L1..L9):

- The motivation and soundness model.
- The [can_filter] precondition and [has_virtual_impl] early-out.
- The narrowing pipeline: read ocamldep raw refs → [referenced] →
  [Lib.closure] → cross-library BFS → classification → emit per-lib
  deps and filtered include flags.
- The data structures used ([Lib_index], the per-cctx
  [cached_raw_refs] / [Filtered_includes] / [Lib.closure] memos).
- Soundness fallbacks (wrapped libs, virtual impls, ppx runtime).
- A source map locating each concern in [src/dune_rules/].
- A layer-by-layer summary of #14513..#14521.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
robinbb added a commit that referenced this pull request May 16, 2026
Add [doc/dev/per-module-narrowing.md] describing the per-module
library file dependency narrowing introduced in this PR (split into
PRs #14513..#14521 as layers L1..L9):

- The motivation and soundness model.
- The [can_filter] precondition and [has_virtual_impl] early-out.
- The narrowing pipeline: read ocamldep raw refs → [referenced] →
  [Lib.closure] → cross-library BFS → classification → emit per-lib
  deps and filtered include flags.
- The data structures used ([Lib_index], the per-cctx
  [cached_raw_refs] / [Filtered_includes] / [Lib.closure] memos).
- Soundness fallbacks (wrapped libs, virtual impls, ppx runtime).
- A source map locating each concern in [src/dune_rules/].
- A layer-by-layer summary of #14513..#14521.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
robinbb added 8 commits May 17, 2026 17:01
Restores correctness for three cases the bare BFS filter mishandles:
- Deps that implement a virtual library: dep-graph through them is
  computed elsewhere ([Dep_rules.imported_vlib_deps]); the per-module
  filter can miss cmi changes. Gate: fall through to glob whenever the
  cctx has [has_virtual_impl].
- Wrapped local libs the consumer references through the wrapper name:
  the ocamldep walk can't see the alias chain into the lib's
  [wrapped_compat] / inner modules. Reach: glob the wrapped lib's
  [Lib.closure].
- [ppx_runtime_libraries] introduced by [pps] in the consumer's
  preprocessor: their modules appear in the post-pp source which
  ocamldep can't see. Reach: glob their [Lib.closure].

[Module_compilation.lib_deps_for_module]:
- After [can_filter], read [Compilation_context.has_virtual_impl]; if
  true, fall back to glob.
- Read [Compilation_context.pps_runtime_libs] and include them in
  [direct_libs] so [Lib.closure] sees them.
- Compute [wrapped_libs_referenced] from the consumer's
  [referenced_modules] (BFS-initial frontier — pre-cross-lib-walk).
  Take the [Lib.closure] of that set union [pps_runtime_libs] to get
  [must_glob_libs]; the classification fold sends every member to the
  glob path.

[Modules]:
- [Wrapped.entry_modules]: new function. Returns the wrapper
  ([lib_interface]) plus every [wrapped_compat] shim. Mirrors what
  [(wrapped (transition ...))] libraries expose to consumers.
- [entry_modules]'s wrapped case switches to use it. Net effect: in
  transition wrapped libs, consumers can resolve any of the bare
  module names the lib exposes; this lifts a false-negative in the
  index that previously hid the consumer's reference to a
  [wrapped_compat] shim from the per-module filter.

Tests (cherry-picked from #14492):
- New soundness fixtures land here:
  [cross-lib-instrumentation-barrier.t], [cross-lib-preprocess-barrier.t],
  [cross-lib-pps-runtime-no-ocamldep-barrier.t],
  [wrapped-from-vlib-soundness.t], [wrapped-transition-soundness.t],
  [mixed-per-module-preprocess.t], [mixed-per-module-preprocess-precision.t],
  [cmx-native-tight-deps.t].
- The five pre-existing tests broken by L4
  ([auto-wrapped-child-reexport.t], [ppx-runtime-libraries.t],
  [virtual-library.t], [wrapped-closure-precision.t],
  [wrapped-reexport-via-open-flag.t]) pass again — soundness
  recovery restores their original behavior; no test file change in
  #14492's diff for them.

Changelog: [doc/changes/added/14492.md] lands now.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Closes a soundness gap in the per-module narrowing pipeline missed
by the wrapped / ppx-runtime / virtual-impl recoveries: when a
dependency library's stanza injects [-open M] via [(flags (...))],
its source can reference [M]'s identifiers without naming [M], and
ocamldep emits no token to drive the cross-library walker.

Reported by RyanJamesStewart on #14517 with this fixture: an
unwrapped [middle] depending on unwrapped [prelude] with
[(flags (:standard -open Prelude))], exposing
[val pick : unit -> color] (where [color] resolves through the open
to [Prelude.color]). The consumer pattern-matches the result
against bare constructors. The compile genuinely needs [prelude.cmi]
to resolve the constructors; the BFS over [ocamldep -modules]
cannot reach [prelude] (no syntactic [Prelude] token on either
side); the three existing recoveries do not catch it ([prelude] is
not wrapped, not a ppx-runtime lib, not a virtual-impl).
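
The shape of that fixture can be approximated in a single file. This is only an analogue under stated assumptions: the real test uses three separate Dune libraries, and the `open Prelude` lines below stand in for the stanza-level `-open Prelude` flag. The constructor names `Red` and `Green` (beyond the `Green` named in the error message) and the `describe` function are illustrative.

```ocaml
(* Stand-in for the unwrapped [prelude] library. *)
module Prelude = struct
  type color = Red | Green
end

(* Stand-in for the unwrapped [middle] library. The [open Prelude] lines
   play the role of [(flags (:standard -open Prelude))]: the body and the
   interface mention [color] without a qualifying [Prelude.] prefix. *)
module Middle : sig
  open Prelude
  val pick : unit -> color
end = struct
  open Prelude
  let pick () : color = Green
end

(* Stand-in for the consumer: no [Prelude] token appears here, yet the
   bare constructors can only be resolved through [Prelude]'s type
   definition (type-directed disambiguation on the scrutinee's type). *)
let describe () =
  match Middle.pick () with
  | Red -> "red"
  | Green -> "green"
```

The consumer type-checks only because the compiler can see the definition of `Prelude.color`; in the multi-library setting that definition lives in `prelude.cmi`, which is why dropping it from the consumer's dependencies is unsound.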

[Module_compilation.cross_lib_tight_set]:
- Add [~mode] (the consumer's compile mode) so we can expand a dep
  lib's stanza flags via [Ocaml_flags.get].
- Extend [read_entry_deps] to compute [read_stanza_opens] for the
  visited lib and union the result into the BFS frontier. Localised
  in the BFS rather than the initial-frontier computation so the
  reachability rule reads as: a module is reachable iff the consumer
  references it, or some reached module's ocamldep names it, or some
  reached module's owning lib stanza-opens it. Returns empty for
  [Spec.standard] (short-circuits external libs).
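
The three-way reachability rule can be sketched as a small breadth-first walk. This is a simplified model, not the code in [Module_compilation]: modules and libraries are plain strings, and `refs`, `owner`, and `stanza_opens` are hypothetical callbacks standing in for the ocamldep read, the library index lookup, and the stanza-flags expansion.

```ocaml
module S = Set.Make (String)

(* [refs m]         : module names ocamldep reports for module [m].
   [owner m]        : the library owning [m], if any.
   [stanza_opens l] : modules that library [l]'s stanza flags [-open].
   All three are assumptions made for this sketch. *)
let reachable ~(refs : string -> string list)
    ~(owner : string -> string option)
    ~(stanza_opens : string -> string list)
    (consumer_refs : string list) : S.t =
  let rec bfs seen = function
    | [] -> seen
    | m :: rest when S.mem m seen -> bfs seen rest
    | m :: rest ->
      (* Extend the frontier with the owning lib's stanza opens. *)
      let opens =
        match owner m with
        | None -> []
        | Some lib -> stanza_opens lib
      in
      bfs (S.add m seen) (rest @ refs m @ opens)
  in
  bfs S.empty consumer_refs

(* The reported fixture: the consumer names [Middle]; neither side has a
   syntactic [Prelude] token; [middle]'s stanza opens [Prelude]. *)
let demo =
  reachable
    ~refs:(fun _ -> [])
    ~owner:(function
      | "Middle" -> Some "middle"
      | "Prelude" -> Some "prelude"
      | _ -> None)
    ~stanza_opens:(function "middle" -> [ "Prelude" ] | _ -> [])
    [ "Middle" ]

let () = assert (S.elements demo = [ "Middle"; "Prelude" ])
```

With `stanza_opens` replaced by a function that always returns the empty list, the same walk reaches only `Middle`, which models the gap this commit closes.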

[Lib_info]:
- Add [stanza_flags : Dune_lang.Ocaml_flags.Spec.t] field plus
  accessor. Local libs carry their stanza's [conf.buildable.flags];
  external libs ([findlib], [dune_package]) carry [Spec.standard].

Regression test: [cross-lib-open-flag-barrier.t]. Fails on L4 and
L5 head before this patch (Unbound constructor Green under
[--sandbox=copy]); passes after.
[Compilation_context.filtered_include_flags]: new function returning the
[-I]/[-H] flags restricted to [kept_libs]. The cctx's [requires_compile]
and [requires_hidden] are each filtered by [Lib.Set.mem kept_libs]; the
result is built as a single [Command.Args.t] under [Action_builder]. No
caching yet — each call recomputes; a follow-up adds the cache.
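
The filtering step itself is simple enough to sketch in isolation. The version below is an assumption-laden model rather than the real function: libraries are strings, `dir_of` is a hypothetical include-directory lookup, and the result is a plain string list instead of a [Command.Args.t] under [Action_builder].

```ocaml
module Lib_set = Set.Make (String)

(* Hypothetical: the include directory of a library named [lib]. *)
let dir_of lib = "lib/" ^ lib

(* Keep only the libraries in [kept_libs], then emit [-I] for the
   compile requirements and [-H] for the hidden ones. *)
let filtered_include_flags ~requires_compile ~requires_hidden ~kept_libs =
  let keep = List.filter (fun l -> Lib_set.mem l kept_libs) in
  let flags opt libs = List.concat_map (fun l -> [ opt; dir_of l ]) libs in
  flags "-I" (keep requires_compile) @ flags "-H" (keep requires_hidden)

let example =
  filtered_include_flags
    ~requires_compile:[ "a"; "unrelated_lib" ]
    ~requires_hidden:[ "h" ]
    ~kept_libs:(Lib_set.of_list [ "a"; "h" ])

let () = assert (example = [ "-I"; "lib/a"; "-H"; "lib/h" ])
```

Because `unrelated_lib` is not in `kept_libs`, it contributes no flag, so adding or removing it leaves the resulting command line unchanged.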

[Module_compilation.lib_deps_for_module]: the tight branch was already
threading [kept_libs] through the classification fold (it had been
unused at L4-L5). Now wired into [filtered_include_flags]; the returned
pair is [(filtered_include_flags, tight_deps + glob_deps)] instead of
[(cctx_includes_for_cm_kind (), …)].

Behavioural effect: a consumer module's compile command sees [-I] /
[-H] only for libraries its ocamldep reference set actually reaches.
Adding an unreferenced sibling to the cctx's [(libraries ...)] no
longer changes the consumer module's compile command, so the rule does
not re-execute.

Tests:
- [per-module-include-flags.t]: promoted — [-I] for the unreferenced
  [unrelated_lib] no longer appears in the consumer's compile rule.
- [add-unreferenced-sibling-lib.t]: promoted — adding an unreferenced
  sibling lib produces no rebuild for consumer modules.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
[Compilation_context.Filtered_includes] caches the [Action_builder.t]
returned by [filtered_include_flags] keyed on
[(lib_mode, kept_libs)]. Two modules in the same cctx that reach the
same set of kept libs share one builder; [Action_builder.memoize]
dedupes its evaluation.

Cache key omits the cctx's [requires_compile] / [requires_hidden] —
they're immutable on the cctx from [create]. The
[for_module_generated_at_link_time] exception, where derived cctxs
could in principle alter the closure, takes [can_filter = false] in
[lib_deps_for_module] and so never reaches this function.

[Filtered_includes.Key]: [lib_mode] + [kept_libs : Lib.t list] (the
caller passes a sorted list via [Lib.Set.to_list], canonicalising for
the cache). [equal] and [hash] derived from the same; [Repr]-derived
[to_dyn] for diagnostics.

[Lib_mode.hash]: new — used by [Filtered_includes.Key.hash]. Three
constants for the three variants ([Ocaml Byte], [Ocaml Native],
[Melange]).
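
A cache key of this shape can be sketched with the standard library's `Hashtbl.Make`. Everything here is illustrative: libraries are represented by name, the hash constants are arbitrary, and the cached value is a plain flag list rather than an [Action_builder.t].

```ocaml
type ocaml_mode = Byte | Native
type lib_mode = Ocaml of ocaml_mode | Melange

(* One constant per variant; the particular values are arbitrary. *)
let lib_mode_hash = function
  | Ocaml Byte -> 0
  | Ocaml Native -> 1
  | Melange -> 2

module Key = struct
  (* [kept_libs] is expected to arrive sorted, so equal sets compare equal. *)
  type t = { lib_mode : lib_mode; kept_libs : string list }

  let equal a b =
    a.lib_mode = b.lib_mode
    && List.equal String.equal a.kept_libs b.kept_libs

  let hash t = Hashtbl.hash (lib_mode_hash t.lib_mode, t.kept_libs)
end

module Cache = Hashtbl.Make (Key)

let cache : string list Cache.t = Cache.create 16

(* Counts how often the value is actually built. *)
let builds = ref 0

let filtered_includes (key : Key.t) : string list =
  match Cache.find_opt cache key with
  | Some flags -> flags
  | None ->
    incr builds;
    let flags = List.concat_map (fun l -> [ "-I"; l ]) key.Key.kept_libs in
    Cache.add cache key flags;
    flags

let () =
  let key mode = { Key.lib_mode = mode; kept_libs = [ "a"; "b" ] } in
  ignore (filtered_includes (key (Ocaml Byte)));
  ignore (filtered_includes (key (Ocaml Byte)));   (* same key: shared *)
  ignore (filtered_includes (key (Ocaml Native))); (* different mode: rebuilt *)
  assert (!builds = 2)
```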

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
[Compilation_context.Raw_refs] caches the [Action_builder.t] computed
for each ocamldep raw-deps read inside a cctx. Two consumer modules
that share trans_deps (or a consumer and one of its trans deps that
share an [obj_name + ml_kind]) get the same builder. The cache
short-circuits before constructing the builder; on hit, no allocation.

[Raw_refs.Key] distinguishes the two read patterns the per-module
filter uses: [Consumer] (the cctx-driving module's own deps, keyed by
[ml_kind]) and [Transitive] (a dep module's deps, keyed by [cm_kind]
because the impl/intf gating in [need_impl_deps_of] varies by cm_kind
on the [Cmx]/opaque path). Conservatively-distinct keying — never
collapse two semantically-different reads under one cache cell.

[Compilation_context.cached_raw_refs t ~key ~compute] is the thin
public surface: lookup, compute on miss, store, return the builder.

[Module_compilation.lib_deps_for_module]: wraps the inline
[read_dep_m_raw] body that the BFS uses for both the consumer's own
and each trans dep's raw refs. No semantic change — the cache only
deduplicates builder construction across calls within the same cctx.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
The per-module filter calls [Lib.closure] twice per consumer module
(once for [direct_libs], once for [must_glob_libs]) on each compile
rule. Across a cctx, many modules pass overlapping inputs to these
closures; without memoisation every call re-traverses the dependency
graph.

[Lib.closure] is now defined as [Memo.exec] over a [Memo.create]
keyed on [(bool * Compilation_mode.t * t list)]. The list-of-libs key
is order- and multiplicity-sensitive, so callers that share inputs
need to canonicalise (sort by [Lib.compare]) for maximum cache reuse —
[lib_deps_for_module] already does this at both call sites. A
docstring on [val closure] notes the requirement.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Add [doc/dev/per-module-narrowing.md] describing the per-module
library file dependency narrowing introduced in #14492 (split into
PRs #14513..#14521 as layers L1..L9):

- The motivation and soundness model.
- The [can_filter] precondition and [has_virtual_impl] early-out.
- The narrowing pipeline: read ocamldep raw refs → [referenced] →
  [Lib.closure] → cross-library BFS → classification → emit per-lib
  deps and filtered include flags.
- The data structures used ([Lib_index], the per-cctx
  [cached_raw_refs] / [Filtered_includes] / [Lib.closure] memos).
- Soundness fallbacks (wrapped libs, virtual impls, ppx runtime).
- A source map locating each concern in [src/dune_rules/].
- A layer-by-layer summary of #14513..#14521.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Documents the fourth class of soundness recovery added in L5: the
BFS's per-iteration step now extends the frontier with the modules
named by each visited library's stanza [-open] flags, in addition
to the entry's impl + intf ocamldep raw refs. The reachability rule
the BFS computes becomes a three-way disjunction (consumer
references it; reached module's ocamldep names it; reached module's
owning lib stanza-opens it).

Updates the [cross_lib_tight_set] code snippet's signature (now
threads [~mode]), the per-iteration description, the "Soundness
recovery and known edge cases" list (adds a fourth class), the L5
layer-summary line, and the cost-characteristics list (adds the
per-visited-lib stanza-flags expansion).
robinbb force-pushed the robinbb-14492-l9-lib-closure-memo branch from 8ed3374 to 356728c on May 18, 2026 00:07
robinbb added a commit that referenced this pull request May 20, 2026
Add [doc/dev/per-module-narrowing.md] describing the per-module
library file dependency narrowing introduced in #14492 (split into
PRs #14513..#14521 as layers L1..L9):

- The motivation and soundness model.
- The [can_filter] precondition and [has_virtual_impl] early-out.
- The narrowing pipeline: read ocamldep raw refs → [referenced] →
  [Lib.closure] → cross-library BFS → classification → emit per-lib
  deps and filtered include flags.
- The data structures used ([Lib_index], the per-cctx
  [cached_raw_refs] / [Filtered_includes] / [Lib.closure] memos).
- Soundness fallbacks (wrapped libs, virtual impls, ppx runtime).
- A source map locating each concern in [src/dune_rules/].
- A layer-by-layer summary of #14513..#14521.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>