Update names field in source maps #8068
Merged
This scans the code after optimizations and removes unused function names from the `names` field in the source map, reducing its size. Emscripten has not generated the `names` field so far, but after #???, it will generate the field if `llvm-dwarfdump` supports the new `--filter-child-tag` option.
aheejin
added a commit
to aheejin/emscripten
that referenced
this pull request
Nov 26, 2025
This adds support for the `names` field in source maps, which contains function names. Source map mappings are updated correspondingly, and emsymbolizer can now provide function name information using only source maps.

While source maps don't provide the full inlined hierarchies, this provides the name of the original (= pre-inlining) function, which may not exist in the final binary because it was inlined. This is because source maps are primarily intended for user debugging.

This also demangles C++ function names using `llvm-cxxfilt`, so the printed names are human-readable.

I tested with `wasm-opt.wasm` from Binaryen, built with the `if (EMSCRIPTEN)` setup here: https://github.com/WebAssembly/binaryen/blob/95b2cf0a4ab2386f099568c5c61a02163770af32/CMakeLists.txt#L311-L372 using `-g -gsource-map`. With this PR and WebAssembly/binaryen#8068, the source map file size increases by 3.5x (8632423 -> 30070042), primarily due to the function name strings.

This also requires additional parsing of the `DW_TAG_subprogram` and `DW_TAG_inlined_subroutine` tags in the `llvm-dwarfdump` output. These tags can be at any depth (because functions can be within nested namespaces or classes), so we cannot use `--recurse-depth=0` (emscripten-core#9580) anymore. In the case of `wasm-opt.wasm` built with DWARF info, without `--recurse-depth=0` in the command line, the size of its text output increased by 27.5x, but with the `--filter-child-tag` / `-t` option (llvm/llvm-project#165720), the text output increased only (?) by 3.2x, which I think is tolerable. This disables `names` field generation when the `-t` option is not available in `llvm-dwarfdump`, because it was added recently.

To avoid this text size problem, we can consider using DWARF-parsing Python libraries like https://github.com/eliben/pyelftools, but this would add another third-party dependency, so I'm not sure if it's worth it at this point.

This also increased the running time of `wasm-sourcemap.py`, in the case of `wasm-opt.wasm`, by 2.3x (6.6s -> 15.4s), but compared to the linking time this was not very noticeable.

Fixes emscripten-core#20715 and closes emscripten-core#25116.
Member
Author
Ping 😄
kripken
reviewed
Dec 4, 2025
// Create the new list of names and the mapping from old to new indices.
uint32_t newIndex = 0;
for (auto& pair : oldToNewIndex) {
Member
This is on an unordered map, so the indexing may end up nondeterministic, I worry?
Member
Author
Changed `oldToNewIndex` to `std::map`.
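To illustrate why the ordered map matters here, the following is a minimal sketch, not the code from this PR. The `CompactedNames` struct and the `compactNames` helper are hypothetical names introduced only for illustration. Because `std::map` iterates in ascending key order, the new indices come out the same on every run, whereas iterating a `std::unordered_map` could assign them in an unspecified order.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical result type (not from Binaryen): the compacted names list
// plus the mapping from old name indices to new ones.
struct CompactedNames {
  std::vector<std::string> names;
  std::map<uint32_t, uint32_t> oldToNewIndex;
};

// Hypothetical helper: keep only the names whose indices appear in
// usedIndices, which may be unordered and may contain duplicates.
CompactedNames compactNames(const std::vector<std::string>& oldNames,
                            const std::vector<uint32_t>& usedIndices) {
  CompactedNames result;
  // Record every used index; the mapped value is filled in below.
  for (uint32_t index : usedIndices) {
    if (index < oldNames.size()) {
      result.oldToNewIndex[index] = 0;
    }
  }
  // std::map iterates in ascending key order, so new indices are assigned
  // deterministically regardless of the order in usedIndices.
  uint32_t newIndex = 0;
  for (auto& [oldIndex, mapped] : result.oldToNewIndex) {
    mapped = newIndex++;
    result.names.push_back(oldNames[oldIndex]);
  }
  return result;
}
```

For example, calling `compactNames` on the names `a, b, c, d` with used indices `3, 1, 3` keeps `b` and `d`, in that order, and maps old index 1 to 0 and old index 3 to 1.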
;;@ src.cpp:2:1:used
(nop)
)
)
Member
Perhaps add another used symbol, to get some coverage of the order of the symbols being deterministic? (not great coverage, but it might help)
dschuff
reviewed
Dec 4, 2025
// Collect all used symbol name indexes.
for (auto& func : wasm->functions) {
  for (auto& pair : func->debugLocations) {
Member
Could you even do something like `[_, location]` here instead of having to use `pair.second`?
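As a rough sketch of that suggestion, the loop could use C++17 structured bindings as below. The `DebugLocation`, `Expression`, and `DebugLocations` types are simplified, hypothetical stand-ins rather than Binaryen's real definitions, and `collectUsedNameIndices` is an illustrative name.

```cpp
#include <cstdint>
#include <optional>
#include <set>
#include <unordered_map>

// Hypothetical, simplified stand-ins for the real types.
struct DebugLocation {
  uint32_t fileIndex = 0;
  uint32_t lineNumber = 0;
  uint32_t columnNumber = 0;
  std::optional<uint32_t> symbolNameIndex;
};
struct Expression {};
using DebugLocations =
  std::unordered_map<const Expression*, std::optional<DebugLocation>>;

// Collect every symbol name index referenced by a debug location.
std::set<uint32_t> collectUsedNameIndices(const DebugLocations& locations) {
  std::set<uint32_t> used;
  // The structured binding names the mapped value directly, so there is
  // no need to write pair.second; the key is bound to `_` and ignored.
  for (const auto& [_, location] : locations) {
    if (location && location->symbolNameIndex) {
      used.insert(*location->symbolNameIndex);
    }
  }
  return used;
}
```

The iteration order of the unordered map does not matter in this sketch, because the indices are gathered into an ordered `std::set`.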
Co-authored-by: Alon Zakai <alonzakai@gmail.com>
dschuff
approved these changes
Dec 5, 2025
Co-authored-by: Derek Schuff <dschuff@chromium.org>
kripken
approved these changes
Dec 5, 2025
Member
CI error might be fixed eventually by #8094
aheejin
added a commit
to emscripten-core/emscripten
that referenced
this pull request
Dec 9, 2025
Member
This change in combination with emscripten-core/emscripten#25870 seems to have broken emscripten testing.
aheejin
added a commit
to aheejin/binaryen
that referenced
this pull request
Dec 10, 2025
WebAssembly#8068 failed to update `prologLocation` and `epilogLocation`, which caused the CI failure: https://app.circleci.com/pipelines/github/emscripten-core/emscripten/47832/workflows/ea0292aa-124d-4a3f-b988-0a96823e9bcd/jobs/1089017/tests This updates them properly.
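The kind of omission described above can be sketched as follows. This is an illustrative example under assumed, simplified types, not the actual fix: `DebugLocation` and `Function` are hypothetical stand-ins, and `remapLocation` and `remapFunction` are made-up helper names. The point is that the prolog and epilog locations are stored separately from the per-expression locations, so they need the same index remapping.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-ins for the real types.
struct DebugLocation {
  uint32_t fileIndex = 0;
  uint32_t lineNumber = 0;
  uint32_t columnNumber = 0;
  std::optional<uint32_t> symbolNameIndex;
};
struct Function {
  std::vector<std::optional<DebugLocation>> debugLocations;
  std::optional<DebugLocation> prologLocation;
  std::optional<DebugLocation> epilogLocation;
};

// Rewrite one location's name index using the old-to-new mapping.
void remapLocation(std::optional<DebugLocation>& location,
                   const std::map<uint32_t, uint32_t>& oldToNewIndex) {
  if (!location || !location->symbolNameIndex) {
    return;
  }
  auto it = oldToNewIndex.find(*location->symbolNameIndex);
  if (it != oldToNewIndex.end()) {
    location->symbolNameIndex = it->second;
  } else {
    // Assumption for this sketch: drop an index that has no mapping.
    location->symbolNameIndex.reset();
  }
}

// Remap the per-expression locations and also the prolog and epilog
// locations, which are easy to miss because they live in separate fields.
void remapFunction(Function& func,
                   const std::map<uint32_t, uint32_t>& oldToNewIndex) {
  for (auto& location : func.debugLocations) {
    remapLocation(location, oldToNewIndex);
  }
  remapLocation(func.prologLocation, oldToNewIndex);
  remapLocation(func.epilogLocation, oldToNewIndex);
}
```

In the same spirit, any pass that collects the used name indices would presumably also need to look at the prolog and epilog locations, so that their names are not dropped from the compacted list.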
aheejin
added a commit
that referenced
this pull request
Dec 10, 2025
kripken
pushed a commit
to kripken/binaryen
that referenced
this pull request
Dec 10, 2025
This scans the code after optimizations and removes unused function names from the `names` field in the source map, reducing its size. Emscripten has not generated the `names` field so far, but after emscripten-core/emscripten#25870, it will generate the field if `llvm-dwarfdump` supports the new `--filter-child-tag` option.
kripken
pushed a commit
to kripken/binaryen
that referenced
this pull request
Dec 10, 2025
inolen
pushed a commit
to inolen/emscripten
that referenced
this pull request
Feb 13, 2026