perf: memoise [Lib.closure] keyed on (linking, for_, libs) #14521
Draft
robinbb wants to merge 8 commits into
robinbb added a commit that referenced this pull request May 16, 2026

Add [doc/dev/per-module-narrowing.md] describing the per-module library file dependency narrowing introduced in #14492 (split into PRs #14513..#14521 as layers L1..L9):

- The motivation and soundness model.
- The [can_filter] precondition and [has_virtual_impl] early-out.
- The narrowing pipeline: read ocamldep raw refs → [referenced] → [Lib.closure] → cross-library BFS → classification → emit per-lib deps and filtered include flags.
- The data structures used ([Lib_index], the per-cctx [cached_raw_refs] / [Filtered_includes] / [Lib.closure] memos).
- Soundness fallbacks (wrapped libs, virtual impls, ppx runtime).
- A source map locating each concern in [src/dune_rules/].
- A layer-by-layer summary of #14513..#14521.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Restores correctness for three cases the bare BFS filter mishandles:

- Deps that implement a virtual library: dep-graph through them is computed elsewhere ([Dep_rules.imported_vlib_deps]); the per-module filter can miss cmi changes. Gate: fall through to glob whenever the cctx has [has_virtual_impl].
- Wrapped local libs the consumer references through the wrapper name: the ocamldep walk can't see the alias chain into the lib's [wrapped_compat] / inner modules. Reach: glob the wrapped lib's [Lib.closure].
- [ppx_runtime_libraries] introduced by [pps] in the consumer's preprocessor: their modules appear in the post-pp source which ocamldep can't see. Reach: glob their [Lib.closure].

[Module_compilation.lib_deps_for_module]:

- After [can_filter], read [Compilation_context.has_virtual_impl]; if true, fall back to glob.
- Read [Compilation_context.pps_runtime_libs] and include them in [direct_libs] so [Lib.closure] sees them.
- Compute [wrapped_libs_referenced] from the consumer's [referenced_modules] (the BFS-initial frontier, before the cross-lib walk). Take the [Lib.closure] of that set union [pps_runtime_libs] to get [must_glob_libs]; the classification fold sends every member to the glob path.

[Modules]:

- [Wrapped.entry_modules]: new function. Returns the wrapper ([lib_interface]) plus every [wrapped_compat] shim. Mirrors what [(wrapped (transition ...))] libraries expose to consumers.
- [entry_modules]'s wrapped case switches to use it. Net effect: in transition wrapped libs, consumers can resolve any of the bare module names the lib exposes; this lifts a false-negative in the index that previously hid the consumer's reference to a [wrapped_compat] shim from the per-module filter.

Tests (cherry-picked from #14492):

- New soundness fixtures land here: [cross-lib-instrumentation-barrier.t], [cross-lib-preprocess-barrier.t], [cross-lib-pps-runtime-no-ocamldep-barrier.t], [wrapped-from-vlib-soundness.t], [wrapped-transition-soundness.t], [mixed-per-module-preprocess.t], [mixed-per-module-preprocess-precision.t], [cmx-native-tight-deps.t].
- The five pre-existing tests broken by L4 ([auto-wrapped-child-reexport.t], [ppx-runtime-libraries.t], [virtual-library.t], [wrapped-closure-precision.t], [wrapped-reexport-via-open-flag.t]) pass again: soundness recovery restores their original behavior; no test file change in #14492's diff for them.

Changelog: [doc/changes/added/14492.md] lands now.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
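The [must_glob_libs] computation described above can be pictured with a small standalone model. This is only a sketch under stated assumptions: libraries are plain strings, [deps] is a made-up dependency graph, [closure] stands in for [Lib.closure], and the [Dropped] verdict for libraries that are neither globbed nor reached is an assumption about the classification fold, not something taken from the dune sources.

```ocaml
(* Hypothetical model: library names are strings and [deps] is a toy graph. *)
module S = Set.Make (String)

let deps =
  [ "wrapped_lib", [ "inner" ]
  ; "ppx_rt", [ "rt_base" ]
  ; "inner", []
  ; "rt_base", []
  ; "plain", []
  ]

(* Transitive closure of [roots] over [deps]; a stand-in for [Lib.closure]. *)
let closure roots =
  let rec go seen = function
    | [] -> seen
    | l :: rest when S.mem l seen -> go seen rest
    | l :: rest ->
      let next = (try List.assoc l deps with Not_found -> []) in
      go (S.add l seen) (next @ rest)
  in
  go S.empty roots

(* Assumed three-way outcome of the classification fold. *)
type verdict = Glob | Tight | Dropped

let classify ~must_glob ~reached lib =
  if S.mem lib must_glob then Glob
  else if S.mem lib reached then Tight
  else Dropped

(* Closure of (wrapped libs referenced) union (pps runtime libs). *)
let must_glob = closure ([ "wrapped_lib" ] @ [ "ppx_rt" ])
```

In this model every member of the closure, including the transitive [inner] and [rt_base], is sent to the glob path regardless of what the BFS reached.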
Closes a soundness gap in the per-module narrowing pipeline missed by the wrapped / ppx-runtime / virtual-impl recoveries: when a dependency library's stanza injects [-open M] via [(flags (...))], its source can reference [M]'s identifiers without naming [M], and ocamldep emits no token to drive the cross-library walker.

Reported by RyanJamesStewart on #14517 with this fixture: an unwrapped [middle] depending on unwrapped [prelude] with [(flags (:standard -open Prelude))], exposing [val pick : unit -> color] (where [color] resolves through the open to [Prelude.color]). The consumer pattern-matches the result against bare constructors. The compile genuinely needs [prelude.cmi] to resolve the constructors; the BFS over [ocamldep -modules] cannot reach [prelude] (no syntactic [Prelude] token on either side); the three existing recoveries do not catch it ([prelude] is not wrapped, not a ppx-runtime lib, not a virtual-impl).

[Module_compilation.cross_lib_tight_set]:

- Add [~mode] (the consumer's compile mode) so we can expand a dep lib's stanza flags via [Ocaml_flags.get].
- Extend [read_entry_deps] to compute [read_stanza_opens] for the visited lib and union the result into the BFS frontier. Localised in the BFS rather than the initial-frontier computation so the reachability rule reads as: a module is reachable iff the consumer references it, or some reached module's ocamldep names it, or some reached module's owning lib stanza-opens it. Returns empty for [Spec.standard] (short-circuits external libs).

[Lib_info]:

- Add [stanza_flags : Dune_lang.Ocaml_flags.Spec.t] field plus accessor. Local libs carry their stanza's [conf.buildable.flags]; external libs ([findlib], [dune_package]) carry [Spec.standard].

Regression test: [cross-lib-open-flag-barrier.t]. Fails on L4 and L5 head before this patch (Unbound constructor Green under [--sandbox=copy]); passes after.
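The three-way reachability rule can be illustrated with a toy BFS over the fixture from the report. Everything here is a hypothetical model: modules and libraries are strings, and [ocamldep_refs], [owning_lib] and [stanza_opens] are hard-coded stand-ins for what the real walker reads from ocamldep output and library stanzas.

```ocaml
module S = Set.Make (String)

(* Syntactic references, as [ocamldep -modules] would report them.
   [Middle] names no module, because [Prelude] only enters via [-open]. *)
let ocamldep_refs = function
  | "Consumer" -> [ "Middle" ]
  | _ -> []

(* Which library owns a module (toy data). *)
let owning_lib = function
  | "Middle" -> Some "middle"
  | "Prelude" -> Some "prelude"
  | _ -> None

(* Modules opened by a library's stanza flags (toy data). *)
let stanza_opens = function
  | "middle" -> [ "Prelude" ]
  | _ -> []

(* A module is reachable iff the consumer references it, or a reached
   module's ocamldep names it, or a reached module's owning lib opens it. *)
let reachable ~use_stanza_opens consumer =
  let rec bfs seen = function
    | [] -> seen
    | m :: rest when S.mem m seen -> bfs seen rest
    | m :: rest ->
      let opens =
        match owning_lib m with
        | Some lib when use_stanza_opens -> stanza_opens lib
        | _ -> []
      in
      bfs (S.add m seen) (rest @ ocamldep_refs m @ opens)
  in
  bfs S.empty (ocamldep_refs consumer)
```

With [~use_stanza_opens:false] the walk stops at [Middle], which mirrors the gap the fixture exposed; with [~use_stanza_opens:true] it also reaches [Prelude].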
[Compilation_context.filtered_include_flags]: new function returning the [-I]/[-H] flags restricted to [kept_libs]. The cctx's [requires_compile] and [requires_hidden] are each filtered by [Lib.Set.mem kept_libs]; the result is built as a single [Command.Args.t] under [Action_builder]. No caching yet: each call recomputes; a follow-up adds the cache.

[Module_compilation.lib_deps_for_module]: the tight branch was already threading [kept_libs] through the classification fold (it had been unused at L4-L5). Now wired into [filtered_include_flags]; the returned pair is [(filtered_include_flags, tight_deps + glob_deps)] instead of [(cctx_includes_for_cm_kind (), …)].

Behavioural effect: a consumer module's compile command sees [-I] / [-H] only for libraries its ocamldep reference set actually reaches. Adding an unreferenced sibling to the cctx's [(libraries ...)] no longer changes the consumer module's compile command, so the rule does not re-execute.

Tests:

- [per-module-include-flags.t]: promoted; [-I] for the unreferenced [unrelated_lib] no longer appears in the consumer's compile rule.
- [add-unreferenced-sibling-lib.t]: promoted; adding an unreferenced sibling lib produces no rebuild for consumer modules.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
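As a rough illustration of the filtering step, the sketch below restricts two library lists to a kept set and emits flat flag lists. The [lib] record, the string-based directories and the plain [string list] result are assumptions made for the example; the real function builds a [Command.Args.t] under [Action_builder].

```ocaml
module S = Set.Make (String)

(* Hypothetical library record: a name plus its object directory. *)
type lib = { name : string; dir : string }

(* Keep only libraries in [kept_libs]; visible ones get -I, hidden ones -H. *)
let filtered_include_flags ~requires_compile ~requires_hidden ~kept_libs =
  let keep l = S.mem l.name kept_libs in
  let flags flag libs =
    List.concat_map (fun l -> [ flag; l.dir ]) (List.filter keep libs)
  in
  flags "-I" requires_compile @ flags "-H" requires_hidden

let example =
  filtered_include_flags
    ~requires_compile:
      [ { name = "used"; dir = "used/.objs" }
      ; { name = "unrelated_lib"; dir = "unrelated/.objs" }
      ]
    ~requires_hidden:[ { name = "hidden_dep"; dir = "hid/.objs" } ]
    ~kept_libs:(S.of_list [ "used"; "hidden_dep" ])
```

Because [unrelated_lib] is not in the kept set, it contributes nothing to [example], so adding or removing it would leave the flag list unchanged.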
[Compilation_context.Filtered_includes] caches the [Action_builder.t] returned by [filtered_include_flags] keyed on [(lib_mode, kept_libs)]. Two modules in the same cctx that reach the same set of kept libs share one builder; [Action_builder.memoize] dedupes its evaluation.

Cache key omits the cctx's [requires_compile] / [requires_hidden]: they're immutable on the cctx from [create]. The [for_module_generated_at_link_time] exception, where derived cctxs could in principle alter the closure, takes [can_filter = false] in [lib_deps_for_module] and so never reaches this function.

[Filtered_includes.Key]: [lib_mode] + [kept_libs : Lib.t list] (the caller passes a sorted list via [Lib.Set.to_list], canonicalising for the cache). [equal] and [hash] derived from the same; [Repr]-derived [to_dyn] for diagnostics.

[Lib_mode.hash]: new, used by [Filtered_includes.Key.hash]. Three constants for the three variants ([Ocaml Byte], [Ocaml Native], [Melange]).

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
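A cache keyed on [(lib_mode, kept_libs)] could look roughly like the following. This is a sketch, not the dune code: the flattened [lib_mode] variant, the string library names, and the use of [Lazy.t] in place of a memoized [Action_builder.t] are all assumptions, and here the function sorts the list itself whereas the commit says the caller passes an already sorted list.

```ocaml
(* Flattened stand-in for the three [Lib_mode] variants. *)
type lib_mode = Ocaml_byte | Ocaml_native | Melange

(* One constant per variant, in the spirit of [Lib_mode.hash]. *)
let lib_mode_hash = function
  | Ocaml_byte -> 0
  | Ocaml_native -> 1
  | Melange -> 2

module Key = struct
  type t = lib_mode * string list

  let equal (m1, l1) (m2, l2) = m1 = m2 && List.equal String.equal l1 l2
  let hash (m, l) = Hashtbl.hash (lib_mode_hash m, l)
end

module Cache = Hashtbl.Make (Key)

let cache : string list Lazy.t Cache.t = Cache.create 16

(* Counts how often the flag list is actually computed. *)
let builds = ref 0

let filtered_includes mode kept_libs =
  (* Canonicalise so that equal sets always produce the same key. *)
  let key = (mode, List.sort_uniq String.compare kept_libs) in
  match Cache.find_opt cache key with
  | Some v -> v
  | None ->
    let v =
      lazy
        (incr builds;
         List.concat_map (fun l -> [ "-I"; l ]) (snd key))
    in
    Cache.add cache key v;
    v
```

Two calls with the same mode and the same set of libraries return the same lazy value, so forcing both evaluates the body once; a different mode gets its own cell.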
[Compilation_context.Raw_refs] caches the [Action_builder.t] computed for each ocamldep raw-deps read inside a cctx. Two consumer modules that share trans_deps (or a consumer and one of its trans deps that share an [obj_name + ml_kind]) get the same builder. The cache short-circuits before constructing the builder; on hit, no allocation.

[Raw_refs.Key] distinguishes the two read patterns the per-module filter uses: [Consumer] (the cctx-driving module's own deps, keyed by [ml_kind]) and [Transitive] (a dep module's deps, keyed by [cm_kind] because the impl/intf gating in [need_impl_deps_of] varies by cm_kind on the [Cmx]/opaque path). Conservatively-distinct keying: never collapse two semantically-different reads under one cache cell.

[Compilation_context.cached_raw_refs t ~key ~compute] is the thin public surface: lookup, compute on miss, store, return the builder.

[Module_compilation.lib_deps_for_module]: wraps the inline [read_dep_m_raw] body that the BFS uses for both the consumer's own and each trans dep's raw refs. No semantic change: the cache only deduplicates builder construction across calls within the same cctx.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
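The lookup, compute-on-miss, store pattern with a two-variant key might be sketched as below. The field names inside [Consumer] and [Transitive], the global table (the real function takes the cctx [t]), and the [string list] standing in for the cached builder are all hypothetical.

```ocaml
type ml_kind = Impl | Intf
type cm_kind = Cmi | Cmo | Cmx

(* Two read patterns kept under distinct keys (field names are assumed). *)
type key =
  | Consumer of { obj_name : string; ml_kind : ml_kind }
  | Transitive of { obj_name : string; cm_kind : cm_kind }

(* One table per cctx in the real code; a single global table here. *)
let table : (key, string list) Hashtbl.t = Hashtbl.create 16

(* Look up [key]; on a miss run [compute], store the result, return it. *)
let cached_raw_refs ~key ~compute =
  match Hashtbl.find_opt table key with
  | Some refs -> refs
  | None ->
    let refs = compute () in
    Hashtbl.add table key refs;
    refs
```

Since [compute] only runs on a miss, a hit constructs nothing new, and a [Consumer] read never shares a cell with a [Transitive] read of the same module.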
The per-module filter calls [Lib.closure] twice per consumer module (once for [direct_libs], once for [must_glob_libs]) on each compile rule. Across a cctx, many modules pass overlapping inputs to these closures; without memoisation every call re-traverses the dependency graph.

[Lib.closure] is now defined as [Memo.exec] over a [Memo.create] keyed on [(bool * Compilation_mode.t * t list)]. The list-of-libs key is order- and multiplicity-sensitive, so callers that share inputs need to canonicalise (sort by [Lib.compare]) for maximum cache reuse; [lib_deps_for_module] already does this at both call sites. A docstring on [val closure] notes the requirement.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
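The effect of an order- and multiplicity-sensitive key can be seen in a small standalone model. This sketch uses a [Hashtbl] rather than dune's [Memo], string names for libraries, a string for the [for_] component, and a made-up [deps] graph; [canonical] is a hypothetical helper playing the role of the sorting done by the call sites.

```ocaml
(* Toy dependency graph: a -> b -> c. *)
let deps = [ "a", [ "b" ]; "b", [ "c" ]; "c", [] ]

(* Counts graph traversals so cache hits are observable. *)
let traversals = ref 0

let closure_uncached (_linking, _for_, libs) =
  incr traversals;
  let rec go seen = function
    | [] -> List.rev seen
    | l :: rest when List.mem l seen -> go seen rest
    | l :: rest ->
      go (l :: seen) (rest @ (try List.assoc l deps with Not_found -> []))
  in
  go [] libs

let memo : (bool * string * string list, string list) Hashtbl.t =
  Hashtbl.create 16

(* Memoised closure keyed on the full (linking, for_, libs) triple. *)
let closure ~linking ~for_ libs =
  let key = (linking, for_, libs) in
  match Hashtbl.find_opt memo key with
  | Some r -> r
  | None ->
    let r = closure_uncached key in
    Hashtbl.add memo key r;
    r

(* Sort and deduplicate so equal sets of libs map to one cache key. *)
let canonical libs = List.sort_uniq String.compare libs
```

Calling [closure] with [canonical ["b"; "a"; "a"]] and then [canonical ["a"; "b"]] hits the same cell, whereas passing the raw list [["b"; "a"]] creates a second entry and a second traversal.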
Documents the fourth class of soundness recovery added in L5: the BFS's per-iteration step now extends the frontier with the modules named by each visited library's stanza [-open] flags, in addition to the entry's impl + intf ocamldep raw refs. The reachability rule the BFS computes becomes a three-way disjunction (consumer references it; reached module's ocamldep names it; reached module's owning lib stanza-opens it). Updates the [cross_lib_tight_set] code snippet's signature (now threads [~mode]), the per-iteration description, the "Soundness recovery and known edge cases" list (adds a fourth class), the L5 layer-summary line, and the cost-characteristics list (adds the per-visited-lib stanza-flags expansion).
Layer 9 of 9 of #14492. Pure performance.

`Lib.closure` is now defined as `Memo.exec` over a `Memo.create` keyed on `(linking, for_, libs)`. The per-module filter calls `Lib.closure` twice per consumer module (once for `direct_libs`, once for `must_glob_libs`); without memoisation every call re-traverses the dependency graph.

The list-of-libs key is order- and multiplicity-sensitive: callers that share inputs (e.g. `lib_deps_for_module` at both call sites) need to canonicalise via `List.sort_uniq ~compare:Lib.compare`; the existing call sites already do.

Stack: rebases on #14520. Final layer of the stack.

Part of #14492. Related to #4572.