
elf_parser: clamp overlapping function boundaries from external sources #108

Closed
jlsandri wants to merge 1 commit into ran-j:main from jlsandri:pr/elf-parser-clamp-overlapping-boundaries

jlsandri commented Apr 6, 2026

Problem

When function boundaries come from an external source (e.g. a Ghidra auto-analysis export), the reported ranges can overlap: a parent function's length is reported as including its nested sub-functions. A concrete example seen in the wild is a 52-byte function reported as 456 bytes because Ghidra's auto-analysis folded nested sub_xxx children into the parent's size.

When the recompiler imports these, extractFunctions() happily keeps both the oversized parent and the nested children. The parent then gets code-generated with all the sub-function bodies inlined, and the per-call PC-mismatch safety checks inside the inlined sub-calls cascade early-returns up the entire call chain at runtime.

Fix

After merging all function sources in extractFunctions(), walk the sorted function list and clamp each function's end to the start of the next function. An exemption applies when the gap to the next function is < 16 bytes, so that the legitimate "entry-slice" pattern (multiple entry points into a single logical function, spaced a few instructions apart) is preserved.
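
The clamp pass described above could look roughly like the following. This is a minimal standalone sketch, not the code from the PR: `FunctionRange`, `clampOverlappingRanges` and `kEntrySliceGap` are hypothetical names, `end` is assumed to be exclusive, and the "gap" is assumed to be the distance between the two start addresses.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the recompiler's function record.
struct FunctionRange {
    std::uint32_t start;  // address of the first byte
    std::uint32_t end;    // one past the last byte (assumed exclusive)
};

// Assumed threshold: starts closer than this are treated as entry slices.
constexpr std::uint32_t kEntrySliceGap = 16;

// Sorts by start address, then clamps each overlapping function's end to the
// start of the next function. Returns how many functions were clamped.
std::size_t clampOverlappingRanges(std::vector<FunctionRange>& funcs) {
    std::sort(funcs.begin(), funcs.end(),
              [](const FunctionRange& a, const FunctionRange& b) {
                  return a.start < b.start;
              });

    std::size_t clamped = 0;
    for (std::size_t i = 0; i + 1 < funcs.size(); ++i) {
        FunctionRange& cur = funcs[i];
        const FunctionRange& next = funcs[i + 1];

        if (cur.end <= next.start) {
            continue;  // well-formed: no overlap, nothing to do
        }
        if (next.start - cur.start < kEntrySliceGap) {
            continue;  // entry-slice pattern: leave the overlap alone
        }
        cur.end = next.start;  // trim the oversized parent
        ++clamped;
    }
    return clamped;
}
```

With the example from the Problem section, a parent at 0x1000 reported as 456 bytes and a child starting 52 bytes later at 0x1034 would leave the parent ending at 0x1034, while two entry points 8 bytes apart would be left untouched.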

Scope

  • One file touched: ps2xRecomp/src/lib/elf_parser.cpp
  • +21 / -0
  • No behavioural change for well-formed function maps — the clamp only fires when the imported data is actually inconsistent.

Rationale

This is purely defensive. The recompiler already expects non-overlapping function ranges; this commit enforces that invariant at import time instead of letting the downstream codegen produce wrong output silently.

Ghidra's auto-analysis can produce overlapping function ranges where a
parent function's reported length includes nested sub-functions (e.g. a
parent reported as 456 bytes when it should be 52). The recompiler then
emits one giant body containing sub-functions inline, and pc-mismatch
safety checks on sub-calls cause cascading early returns.

After merging all function sources in extractFunctions(), clamp each
function's end to the next function's start — unless the gap is < 16
bytes (entry slice pattern).
jlsandri force-pushed the pr/elf-parser-clamp-overlapping-boundaries branch from b8b46b7 to def21a5 on April 6, 2026 08:56

jlsandri commented Apr 6, 2026

Closing as part of a batch cleanup after #107 landed. The runtime ecosystem refactor in #107 substantially reworked the files this PR touched, and I would like to re-audit the underlying fix against the new code structure before putting it back in front of you. If the fix is still needed after that re-audit, I will re-open as a focused PR rebased onto current main. Thanks for your patience.

jlsandri closed this Apr 6, 2026
jlsandri deleted the pr/elf-parser-clamp-overlapping-boundaries branch April 6, 2026 09:17