
Add Win32 PE platform support (PE32 / PE32+)#539

Open
maci0 wants to merge 5 commits into ethteck:main from maci0:add-win32-platform

@maci0 maci0 commented May 20, 2026

Summary

New `platform: win32` covering PE32 (x86) and PE32+ (x86_64) binaries. Self-contained — no changes to existing platform behaviour.

Split across 5 commits for review:

  1. `win32: add PE32 / PE32+ parser and Capstone disassembler facade`
  2. `win32: add segtypes (header, text/asm, data, rodata, bss, bin, pdata)`
  3. `win32: wire platform into options / compiler / disassembler-factory / create_config`
  4. `win32: add win32_reassemble post-process script`
  5. `win32: tests, docs, fixtures, pyproject extras`

What's in scope

Parsing. Walks DOS stub + COFF header + optional header (both bitnesses) + every populated data directory: exports, imports, resources, exception table (with x64 SEH unwind-info opcode decode), base relocations, debug (CodeView PDB GUID/age extraction), TLS callbacks, load config (/GS cookie, /SAFESEH handlers, /guard:cf table), bound imports, delay imports, .NET CLR runtime header. Also the deprecated COFF symbol table when vintage MSVC4-6 binaries ship one. Every iteration loop has a defensive fuzz-cap.
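
As a rough illustration of the header walk described above, the following sketch lists the populated data directories of a PE image. It is not the PR's parser: the function name `populated_directories` and the name table are assumptions made for this example, and only the standard header offsets are relied on. The clamp on `NumberOfRvaAndSizes` shows the kind of cap that keeps a fuzzed count from driving the loop.

```python
import struct

# Standard PE data-directory slot names (indices 0..15).
DIRECTORY_NAMES = [
    "Export", "Import", "Resource", "Exception", "Security",
    "BaseReloc", "Debug", "Architecture", "GlobalPtr", "TLS",
    "LoadConfig", "BoundImport", "IAT", "DelayImport", "CLR", "Reserved",
]


def populated_directories(data: bytes) -> dict:
    """Return {name: (rva, size)} for every non-empty data directory."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ image")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = e_lfanew + 4 + 20  # skip the signature and the 20-byte COFF header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:       # PE32
        count_off = opt + 92
    elif magic == 0x20B:     # PE32+
        count_off = opt + 108
    else:
        raise ValueError(f"unknown optional-header magic {magic:#x}")
    (count,) = struct.unpack_from("<I", data, count_off)
    count = min(count, len(DIRECTORY_NAMES))  # cap a fuzzed NumberOfRvaAndSizes
    found = {}
    for i in range(count):
        entry = count_off + 4 + i * 8
        if entry + 8 > len(data):
            break  # runt header: stop instead of reading past the buffer
        rva, size = struct.unpack_from("<II", data, entry)
        if rva and size:
            found[DIRECTORY_NAMES[i]] = (rva, size)
    return found
```

Calling it on the bytes of a real binary returns a small dictionary such as one `Import` entry and one `BaseReloc` entry; each RVA still has to be translated to a file offset before the directory can be read.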

Disassembly. Capstone-backed (`capstone>=5.0.0` as a new optional dep group). Two-pass: the first pass collects every direct `call`/`jmp` target inside the segment to seed function/branch labels; the second pass emits operands rewritten so addresses, IAT slots, exports, and RIP-relative loads resolve to readable labels. GAS-incompatible Capstone outputs (`popal`, `xword ptr`, scalar SSE size qualifiers, `riz`/`eiz` placeholders, oversized `enter` immediates, etc.) are rewritten so the `.s` output assembles cleanly.
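
The two-pass idea can be sketched without Capstone by working on already-decoded instructions. Everything below is illustrative: the `(address, mnemonic, operand string)` tuples stand in for disassembler output, the helper names are hypothetical, and the zero-padded `func_`/`loc_` label width is an assumption rather than the format the PR emits.

```python
import re
from typing import Optional

HEX_IMM = re.compile(r"0x[0-9a-f]+")


def direct_target(mnemonic: str, op_str: str) -> Optional[int]:
    """Return the immediate target of a direct call/jump, else None."""
    is_branch = mnemonic == "call" or mnemonic.startswith("j")
    if is_branch and HEX_IMM.fullmatch(op_str):
        return int(op_str, 16)
    return None


def seed_labels(insns, seg_start: int, seg_end: int) -> dict:
    """Pass one: map each in-segment direct call/jump target to a label."""
    labels = {}
    for _addr, mnemonic, op_str in insns:
        target = direct_target(mnemonic, op_str)
        if target is None or not seg_start <= target < seg_end:
            continue
        if mnemonic == "call":
            labels[target] = f"func_{target:08x}"  # a call target wins over a branch
        else:
            labels.setdefault(target, f"loc_{target:08x}")
    return labels


def render(insns, labels: dict) -> list:
    """Pass two: emit label definitions and substitute labels into operands."""
    lines = []
    for addr, mnemonic, op_str in insns:
        if addr in labels:
            lines.append(f"{labels[addr]}:")
        target = direct_target(mnemonic, op_str)
        if target in labels:
            op_str = labels[target]
        lines.append(f"    {mnemonic} {op_str}".rstrip())
    return lines
```

Collecting all targets before emitting anything is what lets a backward or forward branch refer to a label that is defined elsewhere in the same segment.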

Round-trip. With `exact_encoding: true` on the text/data/pdata subsegments, the reassembled PE is byte-identical to the original. Verified on 5 real-world binaries:

  • Sysinternals PsExec (MSVC 14 PE32, 716 KB)
  • Sysinternals PsExec64 (MSVC 14 PE32+, 833 KB, 1049 RUNTIME_FUNCTIONs)
  • PuTTY 0.60 (vintage MSVC 6 PE32, 455 KB)
  • PuTTY 0.70 32-bit (MSVC 14 with `.00cfg` CFG section, 774 KB)
  • PuTTY 0.83 64-bit (MSVC 14 PE32+, 1.7 MB, 2410 RUNTIME_FUNCTIONs)

New driver: `python -m splat.scripts.win32_reassemble <yaml>` runs `as` + `ld` + `objcopy` against the splat-generated linker script + sections.
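
A byte-identity claim like this is easy to check independently. The helper below is a minimal sketch (the name `first_mismatch` is made up for this example) that reports where two images first diverge, which is more useful when debugging a near-match than a plain hash comparison.

```python
from typing import Optional


def first_mismatch(original: bytes, rebuilt: bytes) -> Optional[int]:
    """Return the first differing file offset, or None when byte-identical."""
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            return offset
    if len(original) != len(rebuilt):
        # One file is a strict prefix of the other.
        return min(len(original), len(rebuilt))
    return None
```

Typical use would be to read both files with `pathlib.Path.read_bytes()` and treat any non-`None` result as a failed round-trip.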

Tooling.

  • `create_config` auto-detects PE (MZ + PE magic) and emits a YAML + `symbol_addrs.txt` with named symbols for every relevant directory entry (entrypoint, exports incl. forwarders, eager + delay imports, TLS callbacks, SafeSEH handlers, /guard:cf targets, /GS security cookie, .NET CLR pointers, unwind RVAs).
  • 11 new compiler tags registered: `MSVC2..14`, `MINGW`, `CLANG_LLD`. All share the same MASM-style asm config; distinct names preserve provenance.
  • MSVC version auto-detected from MajorLinkerVersion; MinGW / Clang-LLD recognised via section names + import DLL fingerprints.
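
For readers unfamiliar with the two signals mentioned above, here is a small sketch of PE detection and of reading `MajorLinkerVersion`. The function names are hypothetical, and the mapping from linker major version to an `MSVC<n>` tag is an assumption about how the tags are chosen; the MinGW / Clang-LLD fingerprinting is not covered.

```python
import struct
from typing import Optional


def is_pe(data: bytes) -> bool:
    """True when the buffer starts with MZ and e_lfanew points at the PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return data[e_lfanew:e_lfanew + 4] == b"PE\0\0"


def guess_msvc_tag(data: bytes) -> Optional[str]:
    """Guess a compiler tag from MajorLinkerVersion (assumed 1:1 mapping)."""
    if not is_pe(data):
        return None
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    opt = e_lfanew + 4 + 20  # optional header follows signature + COFF header
    if opt + 3 > len(data):
        return None
    major = data[opt + 2]    # MajorLinkerVersion is the byte after the magic
    return f"MSVC{major}" if 2 <= major <= 14 else None
```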

What's deferred (could be follow-ups)

  • ARM32 / ARM64 disassembly — parser rejects with friendly error; needs non-x86 Capstone backend
  • PDB content parsing — only filename + GUID/age surfaced (needs MSF / CodeView reader)
  • CLR metadata table decoding — only header decoded (needs ECMA-335 reader)
  • HIGHADJ / HIGH+LOW base relocations — skipped, only HIGHLOW + DIR64 consumed
  • Resource extraction to per-leaf bin files — leaves enumerated but not auto-split
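
To make the base-relocation item concrete, this is a minimal sketch of walking `.reloc` blocks and keeping only HIGHLOW and DIR64 fixups. It follows the standard block layout (page RVA, block size, then 16-bit entries); the function name and the `MAX_BLOCKS` cap are choices made for this example, not values from the PR.

```python
import struct

IMAGE_REL_BASED_HIGHLOW = 3   # 32-bit absolute pointer
IMAGE_REL_BASED_DIR64 = 10    # 64-bit absolute pointer
MAX_BLOCKS = 1 << 16          # arbitrary iteration cap for this sketch


def pointer_rvas(reloc: bytes) -> list:
    """Return (rva, pointer_size) for each HIGHLOW / DIR64 fixup."""
    out = []
    pos = 0
    for _ in range(MAX_BLOCKS):
        if pos + 8 > len(reloc):
            break
        page_rva, block_size = struct.unpack_from("<II", reloc, pos)
        if block_size < 8 or pos + block_size > len(reloc):
            break  # malformed block: stop rather than loop or over-read
        for i in range((block_size - 8) // 2):
            (entry,) = struct.unpack_from("<H", reloc, pos + 8 + 2 * i)
            kind, offset = entry >> 12, entry & 0xFFF
            if kind == IMAGE_REL_BASED_HIGHLOW:
                out.append((page_rva + offset, 4))
            elif kind == IMAGE_REL_BASED_DIR64:
                out.append((page_rva + offset, 8))
            # ABSOLUTE padding, HIGHADJ and other types are skipped here.
        pos += block_size
    return out
```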

Tests

```
$ python -m unittest test test_n64_entrypoints test_win32_pe
Ran 199 tests in 0.079s -> OK

$ python -m unittest test.Testing
Ran 10 tests in 0.062s -> OK

$ python -m mypy src/splat
Success: no issues found in 111 source files
```

`test_win32_pe.py` covers every parser branch + fuzz-input edge cases; `test.py` covers split + assemble + byte-identical round-trip end-to-end for both PE32 and PE32+ synthetic fixtures.

Test plan

  • All 199 unit tests pass
  • All 10 end-to-end tests pass (split + assemble + byte-identical round-trip)
  • mypy clean across 111 source files
  • Real-binary round-trip on 5 zoo binaries (PsExec / PsExec64 / 3× PuTTY) — all byte-identical
  • No regressions in N64 / PSX / PS2 / PSP behaviour (existing test.py + test_n64_entrypoints all pass)
  • CI runs against a fresh checkout

maci0 added 5 commits May 20, 2026 11:55

New `src/splat/platforms/win32.py` (~1600 LOC, self-contained):

- Walks DOS stub, COFF file header, optional header (PE32 and PE32+),
  section table, and every populated data directory:
    0 Export, 1 Import, 2 Resource, 3 Exception (incl. PE32+ unwind
    info opcode decode), 5 Base Relocation, 6 Debug (CodeView PDB
    GUID / age extraction), 9 TLS callbacks, 10 Load Config
    (/GS SecurityCookie, /SAFESEH handlers, /guard:cf table),
    11 Bound Import, 13 Delay Import, 14 CLR Runtime Header.
- Also parses the deprecated COFF symbol table when the optional
  header points at one (vintage MSVC 4-6 binaries).
- Defensive fuzz-cap on every iteration loop so a malformed PE
  can't make the parser scan billions of records.
- Public helpers: sanitize_label, compute_iat_labels,
  compute_export_labels (used by both create_config and the text
  segtype to keep symbol_addrs.txt and disasm references in sync);
  ptr_layout / resolve_exact_encoding shared across segtypes.

New `src/splat/disassembler/capstone_disassembler.py` (~110 LOC):
thin facade that picks CS_MODE_32 / CS_MODE_64 from
pe.is_pe32_plus, leaves engine creation lazy (target may not be
parsed yet when configure() runs), and exposes the same
known_types() primitive vocabulary spimdisasm uses so
symbol_addrs entries can use the conventional type:u32 /
type:asciz tokens.

Both modules are stand-alone — nothing else in this commit
imports them yet.

Eight segtype modules under src/splat/segtypes/win32/:

- header.py — emits a structured .section .header byte-by-byte
  dump of the DOS stub + COFF + optional header (PE32 and PE32+
  variants) + every data directory + the section table. Each field
  renders as a width-correct .short / .long / .quad directive
  with a trailing comment naming the field. A one-page human-
  readable summary block (Machine / ImageBase / EntryPoint /
  Subsystem / characteristics flags by name / sections / exports /
  imports grouped by DLL / PDB GUID/age / TLS / resources / .NET CLR
  fields) precedes the byte emission.

- text.py — two-pass Capstone disassembly. The first pass walks
  every direct call / jmp <imm> target inside the segment to seed
  function / branch labels (func_<va>, loc_<va>); the second
  emits instructions with operand strings rewritten so addresses,
  IAT slots, exports, and RIP-relative loads resolve to readable
  labels. GAS-incompatible Capstone outputs (popal, xword ptr,
  scalar SSE size qualifiers, riz/eiz SIB placeholders,
  oversized enter immediates, etc.) are rewritten so the .s
  output assembles cleanly. With exact_encoding: true the
  instruction bytes are emitted verbatim with the decoded mnemonic
  as a trailing comment — necessary for byte-identical round-trip
  through GAS + objcopy.

- asm.py — alias so YAML can use type: asm (other splat
  platforms' convention).

- data.py — heuristic emission of .data bytes:
    - pointer slots flagged by base-relocations -> .long / .quad
      with symbolic target labels (or raw hex when exact_encoding);
    - NUL-terminated printable runs -> .asciz;
    - UTF-16LE wide strings -> raw .byte plus a /* L"..." */
      preview comment;
    - long zero runs collapsed into .space N directives.

- rodata.py — alias on top of data.py with LINKER_SECTION = .rodata
  and string-detection / pointer-heuristic on by default (read-only
  data overwhelmingly contains strings or function-pointer tables).

- bss.py — NOLOAD reservation with .space N derived from the
  YAML's bss_size: or vram_end - vram_start.

- bin.py — opaque marker class reusing CommonSegBin for sections
  whose bytes are structured loader-time data (.rsrc / .reloc /
  .idata / coff_symtab / signature).

- pdata.py — PE32+ exception directory: each RUNTIME_FUNCTION
  record renders as a .long Begin, End, Unwind row with
  func_<va> labels resolved by reflinking through the symbol
  table. The unwind RVA is emitted symbolically as
  (unwind_<va> - ImageBase) | 0x80000000 so the chained-record
  flag stays correct. In exact_encoding mode the rows emit raw
  hex RVAs for byte-identical reassembly. Each row's trailing
  comment carries the decoded UNWIND_INFO opcode list (PUSH_NONVOL,
  ALLOC_SMALL, SAVE_NONVOL, etc.) when one was found.

Behaviour is gated entirely off the win32 platform module added in
the previous commit — no changes to common splat segtypes.
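
Two of the data.py heuristics listed above, string detection and zero-run collapsing, can be sketched in a few lines. This is an illustration only: the thresholds `MIN_STRING` and `MIN_ZEROS` and the function name are assumptions, and pointer slots and UTF-16 strings are left out.

```python
MIN_STRING = 4   # assumed minimum length for a printable run
MIN_ZEROS = 16   # assumed minimum length for a collapsed zero run


def emit_data(blob: bytes) -> list:
    """Turn raw bytes into .space / .asciz / .byte directives."""
    lines = []
    i = 0
    while i < len(blob):
        # Long zero run -> one .space directive.
        j = i
        while j < len(blob) and blob[j] == 0:
            j += 1
        if j - i >= MIN_ZEROS:
            lines.append(f".space {j - i}")
            i = j
            continue
        # NUL-terminated printable ASCII run -> .asciz (terminator consumed).
        j = i
        while j < len(blob) and 0x20 <= blob[j] <= 0x7E:
            j += 1
        if j - i >= MIN_STRING and j < len(blob) and blob[j] == 0:
            text = blob[i:j].decode("ascii")
            text = text.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'.asciz "{text}"')
            i = j + 1
            continue
        # Anything else falls back to a single raw byte.
        lines.append(f".byte 0x{blob[i]:02x}")
        i += 1
    return lines
```
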
… create_config

Splat-core hooks needed to expose the new win32 platform:

- util/options.py: accept win32 in the parse_opt_within list of
  valid platforms. Default auto_link_sections to [] for
  platform: win32 (the existing MIPS-style [.data, .rodata, .bss]
  default generates phantom LinkerEntries on PE binaries because
  each section is its own subsegment rather than an implicit
  sibling of a base text segment).

- util/compiler.py: register MSVC2..14, MINGW, CLANG_LLD. All
  share the same MASM-style asm config (.globl for symbols, no
  end-label, no INCLUDE_ASM). Distinct names keep generated
  configs documenting which toolchain produced the binary.

- util/file_presets.py: short-circuit for platform == "win32" in
  write_assembly_inc_files — the win32 segtypes emit asm directly
  with .section ... , "<flags>" headers and don't rely on the
  macros.inc / labels.inc helpers.

- disassembler/disassembler_instance.py: route
  platform == "win32" to the Capstone facade added in the earlier
  commit.

- segtypes/__init__.py + platforms/__init__.py: re-export the new
  win32 packages.

- scripts/create_config.py: new create_win32_config branch.
  Detects PE files (MZ + PE magic) and emits a YAML + symbol_addrs
  layout covering every parsed directory:
    * Per-section subsegments (text / data / rodata / pdata / bss /
      bin per the section characteristics; .reloc and .rsrc as bin).
    * Entrypoint, exports (incl. forwarder comments), eager imports,
      delay imports, TLS callbacks, SafeSEH handlers, /guard:cf
      targets, /GS security cookie, .NET CLR pointers, unwind RVAs,
      all as named symbol_addrs entries with type tags.
    * MSVC version auto-detected from MajorLinkerVersion;
      MinGW / Clang-LLD recognised via section names + import DLL
      fingerprints.
    * Post-section appendages (COFF symtab, Authenticode signature)
      emitted as bin segments with a high-address sandbox VMA so
      the linker doesn't fold them onto loaded sections.
    * Sanitisation rules shared with the segtypes via
      platforms.win32.sanitize_label so disassembly references
      resolve to the same identifiers the YAML declares.

No behaviour change for existing platforms — all win32 logic is
gated behind platform == "win32".

New python -m splat.scripts.win32_reassemble <yaml> driver that
inverts splat split for win32 binaries.

Pipeline:
  1. Run as on every .s under asm_path / data_path -> .o files at
     the build_path layout the splat-generated linker script
     expects (<build_path>/asm/<rel>.s.o by default, <rel>.o when
     the YAML has o_as_suffix: True).
  2. Wrap any .bin assets via objcopy -I binary -O elf... so they
     can be linked in.
  3. Invoke ld -N -m elf_i386 | elf_x86_64 -T <splat.ld> from
     base_path. The splat linker script already places each section
     at the right LMA / file-offset.
  4. objcopy --set-section-flags .header=alloc,load,data (the
     custom .header section starts READONLY-only from GAS),
     then objcopy -O binary to extract the linked image —
     the .header section already carries the full DOS+COFF+optional
     header bytes, so the binary blob IS the reassembled PE.
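
Steps 3 and 4 of the pipeline can be sketched as a function that only builds the command lines. The flags are the ones named above; the function name, the file names, and the use of `-o` to name the intermediate ELF are assumptions for this example. Steps 1 and 2 are left out because their exact `as` and `objcopy` arguments are not spelled out here.

```python
def link_commands(is_pe32_plus: bool, objects: list, ld_script: str,
                  elf_out: str, pe_out: str) -> list:
    """Build the ld / objcopy invocations for the link and extract steps."""
    emulation = "elf_x86_64" if is_pe32_plus else "elf_i386"
    return [
        # Link every object through the splat-generated linker script.
        ["ld", "-N", "-m", emulation, "-T", ld_script, "-o", elf_out, *objects],
        # Mark .header as loadable so it survives the binary extraction.
        ["objcopy", "--set-section-flags", ".header=alloc,load,data", elf_out],
        # Flatten the linked image into the final PE file.
        ["objcopy", "-O", "binary", elf_out, pe_out],
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` from the project's base path; keeping construction separate from execution makes the commands easy to log and test.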

With exact_encoding: true on the text/data/pdata subsegments, the
reassembled PE is byte-identical to the original. Verified on
PsExec / PsExec64 / PuTTY 0.60 / PuTTY 0.70 32-bit / PuTTY 0.83
64-bit (716 KB to 1.7 MB; vintage MSVC6 through modern MSVC14;
both PE32 and PE32+).

Coverage:

- test_win32_pe.py (~5000 LOC, 199 unit tests): every PE parser
  branch (DOS/COFF/optional header, all data directories incl.
  fuzz-input edge cases — runt headers, fuzzed NumberOfRvaAndSizes,
  pathological string-table offsets, virtual-tail RVA rejection),
  label-generation helpers (sanitize_label / compute_iat_labels /
  compute_export_labels), every segtype's behaviour (header summary
  width adjustments for PE32+, exact_encoding inheritance,
  resource-only DLLs, all-forwarder shim DLLs, BSS-only PEs,
  phantom-pointer sections, etc.), CapstoneDisassembler engine
  selection, and Win32SegBin / Win32SegAsm marker semantics.

- test.py: 10 end-to-end test methods covering split + assemble +
  byte-identical round-trip on PE32 and PE32+ synthetic fixtures,
  plus a win32_reassemble byte-identity test for each bitness.

- test/win32_app/ + test/win32_app64/: synthetic PE32 / PE32+
  fixtures with generate.py scripts that emit the binaries on the
  fly (so the suite is hermetic and no binary blobs are committed).

- test-binaries/zoo/README.md: catalogue of 30 freely-
  redistributable PE binaries spanning 1995-2025 + ARM64, organised
  by era band (MSVC 4-6 through MSVC 14.x, MinGW, ScummVM,
  Sysinternals PSTools, PuTTY, etc.) with direct download URLs.
  Binaries themselves stay outside the repo via the .gitignore.

- pyproject.toml: new win32 optional dependency group
  (capstone>=5.0.0). dev pulls it in.

- README.md: surface the win32 platform support and the three
  user-facing scripts (splat split / create_config /
  win32_reassemble).

- CHANGELOG.md: unreleased entry summarising the platform
  addition.
@maci0 maci0 force-pushed the add-win32-platform branch from 88147d9 to 6e12673 May 20, 2026 03:56
@AngheloAlf
Collaborator

Hi!

That's a lot of work, pretty impressive.
I see a lot of technical discussion in your PR, but you don't mention which game you are interested in working on that led you to add this.

Are you part of any existing win decomp communities? What are they currently using for their decomp work?

Are you using this as part of your decomp workflow? What other tooling are you using? It would be nice if you shared it so we can take a look.

The x86_64 support surprises me. I haven't heard of anyone doing Windows x86_64 decomp, so this is kind of new.

Also, you mention in your post that you were able to rebuild a byte-by-byte identical PE binary. To me that's a bit hard to believe.
In the ELF world we have struggled a lot to get matching ELFs in many different platforms.
Maybe PE is a lot less of a problem, but I still remain skeptical.
Could you elaborate on this approach please?

Thanks for reaching out!

@maci0
Author

maci0 commented May 21, 2026

Hi, thanks for taking a look.

Specific target is server.dll from Europa 1400 (The Guild Gold, 4HEAD Studios, ~2002). It's the multiplayer/networking module, MSVC6-era PE32. I'm part of https://github.com/europa1400-community, we've been chipping at the game on and off for a while and I wanted a way to split the binary into TUs and iterate on a matching rebuild instead of the usual Ghidra-export + paste workflow.

Not really part of any broader win decomp community beyond that. Nothing splat-shaped exists for PE that I've found, which was part of why this seemed worth doing.

On the tooling side I'm also working on https://github.com/maci0/rebrew, a compiler-in-the-loop matching workbench: GA engine for flag/source mutation to drive functions toward byte-exact, symbolic equivalence checks via angr+Z3 for NEAR_MATCHING cases, FLIRT for library identification, plus Ghidra sync via ReVa MCP. splat could slot in upstream of that: it could produce the initial per-TU skeleton that rebrew then iterates on.

x86_64 was almost an accident, the same parsing code path handles both and Capstone covers both, so once PE32 worked PE32+ mostly came for free. The genuinely different parts are .pdata and UNWIND_INFO, since that's what's new vs PE32. PuTTY 0.83 x64 (~1.7MB, 2410 RUNTIME_FUNCTIONs) was my stress test there.
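
For context on what a `.pdata` record looks like, here is a minimal sketch that decodes x64 RUNTIME_FUNCTION entries (three little-endian 32-bit RVAs each) and renders them as raw-hex rows. The names `parse_pdata` and `raw_row` are hypothetical, and the row format only approximates the `.long Begin, End, Unwind` layout described in the commit message.

```python
import struct
from typing import NamedTuple


class RuntimeFunction(NamedTuple):
    begin_rva: int
    end_rva: int
    unwind_rva: int


def parse_pdata(pdata: bytes) -> list:
    """Decode consecutive 12-byte RUNTIME_FUNCTION records."""
    usable = len(pdata) - len(pdata) % 12  # ignore a trailing partial record
    return [RuntimeFunction(*fields)
            for fields in struct.iter_unpack("<III", pdata[:usable])]


def raw_row(rec: RuntimeFunction) -> str:
    """Render one record as a raw-hex .long row."""
    return f".long 0x{rec.begin_rva:x}, 0x{rec.end_rva:x}, 0x{rec.unwind_rva:x}"
```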

On byte-identical, fair to be skeptical. PE is structurally a lot friendlier than ELF for this though. No GOT, no PLT, no link-time inter-TU relocations. Import resolution happens at load time, so the import table is just static data in the binary. Sections land in file order and ld preserves that via the linker script.
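
Because directories such as the import table are plain bytes in the file, locating them comes down to translating an RVA into a file offset through the section table. The sketch below shows that translation; the `Section` tuple and `rva_to_offset` are illustrative names, not the PR's API. RVAs that fall in a section's memory-only tail are rejected because no file bytes back them.

```python
from typing import NamedTuple, Optional


class Section(NamedTuple):
    name: str
    virtual_address: int  # RVA where the section is mapped
    virtual_size: int
    raw_offset: int       # PointerToRawData
    raw_size: int         # SizeOfRawData


def rva_to_offset(sections: list, rva: int) -> Optional[int]:
    """Map an RVA to a file offset, or None when no file bytes back it."""
    for s in sections:
        delta = rva - s.virtual_address
        if 0 <= delta < max(s.virtual_size, s.raw_size):
            # Beyond raw_size the section exists only in memory (zero-filled).
            return s.raw_offset + delta if delta < s.raw_size else None
    return None
```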

Some of our targets actually round-trip byte-identical with the default disasm (no exact_encoding needed) because nothing in the code triggers label substitution in operand encoding. exact_encoding: true on text/data/pdata is the safety net for the ones that do, it makes disasm emit the literal byte sequence rather than label-rewritten operands. Reassemble with as + ld + objcopy -O binary. Headers, .rdata pointers, IAT slots, .reloc blocks, .pdata records all reproduce because they're written back as opaque byte arrays, not regenerated from semantic info.
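
The reason literal bytes matter is that x86 often has more than one valid encoding for the same instruction: `mov ebp, esp`, for example, can be `8B EC` or `89 E5`, and an assembler picks one on its own. A minimal sketch of the verbatim emission follows; the function name and the exact comment style are assumptions, not the PR's output format.

```python
def emit_exact(raw: bytes, disasm: str) -> str:
    """Render one instruction as literal bytes with the decoded text as a comment."""
    body = ", ".join(f"0x{b:02x}" for b in raw)
    return f"    .byte {body}  /* {disasm} */"
```

For instance, `emit_exact(bytes.fromhex("8bec"), "mov ebp, esp")` keeps the `8B EC` form in the output regardless of which encoding the assembler would have chosen for the mnemonic.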

One thing that did need thought: post-section appendages like the Authenticode signature blob or a high-offset COFF symtab. Those needed explicit high-vram entries in the yaml so the section layout doesn't collapse on them. PsExec is signed so that path's exercised end-to-end.
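
Whether a file has such an appendage can be checked by comparing the end of the last section's raw data with the file size. The sketch below does just that; `overlay_range` is a hypothetical helper, and it takes plain `(PointerToRawData, SizeOfRawData)` pairs rather than any structure from the PR.

```python
from typing import Optional


def overlay_range(sections: list, file_size: int) -> Optional[tuple]:
    """Return (start, end) of bytes after the last section, or None if there are none."""
    end = max((off + size for off, size in sections if size), default=0)
    return (end, file_size) if end < file_size else None
```

Any bytes in the returned range (a signature blob or a trailing COFF symbol table, say) need their own segment entry so the reassembled file keeps them at the same offset.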

Verified on PsExec / PsExec64 / PuTTY 0.60 / PuTTY 0.70 32-bit / PuTTY 0.83 64-bit, plus our actual server.dll target, all round-trip byte-identical including the embedded signature on the signed ones.

There's a python -m splat.scripts.win32_reassemble <yaml> script in the PR that wraps the as/ld/objcopy invocation. Happy to walk through anything specific.
