Add Win32 PE platform support (PE32 / PE32+) #539
New `src/splat/platforms/win32.py` (~1600 LOC, self-contained):
- Walks DOS stub, COFF file header, optional header (PE32 and PE32+),
section table, and every populated data directory:
0 Export, 1 Import, 2 Resource, 3 Exception (incl. PE32+ unwind
info opcode decode), 5 Base Relocation, 6 Debug (CodeView PDB
GUID / age extraction), 9 TLS callbacks, 10 Load Config
(/GS SecurityCookie, /SAFESEH handlers, /guard:cf table),
11 Bound Import, 13 Delay Import, 14 CLR Runtime Header.
- Also parses the deprecated COFF symbol table when the optional
header points at one (vintage MSVC 4-6 binaries).
- Defensive fuzz-cap on every iteration loop so a malformed PE
can't make the parser scan billions of records.
- Public helpers: sanitize_label, compute_iat_labels,
compute_export_labels (used by both create_config and the text
segtype to keep symbol_addrs.txt and disasm references in sync);
ptr_layout / resolve_exact_encoding shared across segtypes.
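For reviewers less familiar with the format, the first step of the header walk (telling PE32 from PE32+) can be sketched as below. This is a minimal illustration, not the code in `win32.py`; `detect_pe_flavour` and `make_stub` are hypothetical names. The offsets (`e_lfanew` at 0x3C, the optional-header magic 24 bytes past the signature) and the magic values 0x10B / 0x20B come from the PE/COFF specification.

```python
import struct

PE32_MAGIC = 0x10B       # IMAGE_NT_OPTIONAL_HDR32_MAGIC
PE32_PLUS_MAGIC = 0x20B  # IMAGE_NT_OPTIONAL_HDR64_MAGIC


def detect_pe_flavour(data: bytes) -> str:
    """Return "PE32" or "PE32+", or raise ValueError for non-PE input."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (file offset of the PE signature) lives at 0x3C in the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # Need: signature (4) + COFF file header (20) + optional-header Magic (2).
    if e_lfanew + 26 > len(data):
        raise ValueError("e_lfanew points past end of file")
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (magic,) = struct.unpack_from("<H", data, e_lfanew + 24)
    if magic == PE32_MAGIC:
        return "PE32"
    if magic == PE32_PLUS_MAGIC:
        return "PE32+"
    raise ValueError(f"unknown optional-header magic {magic:#x}")


def make_stub(magic: int) -> bytes:
    """Build a tiny synthetic header for demonstration (not a loadable PE)."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature directly after DOS header
    coff = bytes(20)                          # zeroed COFF file header
    return bytes(dos) + b"PE\0\0" + coff + struct.pack("<H", magic)


print(detect_pe_flavour(make_stub(PE32_MAGIC)))
print(detect_pe_flavour(make_stub(PE32_PLUS_MAGIC)))
```

Every read is bounds-checked before unpacking, which is the same defensive idea as the fuzz-cap mentioned above, applied to offsets rather than loop counts.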
New `src/splat/disassembler/capstone_disassembler.py` (~110 LOC):
thin facade that picks CS_MODE_32 / CS_MODE_64 from
pe.is_pe32_plus, leaves engine creation lazy (target may not be
parsed yet when configure() runs), and exposes the same
known_types() primitive vocabulary spimdisasm uses so
symbol_addrs entries can use the conventional type:u32 /
type:asciz tokens.
Both modules are stand-alone — nothing else in this commit
imports them yet.
Eight segtype modules under src/splat/segtypes/win32/:
- header.py — emits a structured .section .header byte-by-byte
dump of the DOS stub + COFF + optional header (PE32 and PE32+
variants) + every data directory + the section table. Each field
renders as a width-correct .short / .long / .quad directive
with a trailing comment naming the field. A one-page human-
readable summary block (Machine / ImageBase / EntryPoint /
Subsystem / characteristics flags by name / sections / exports /
imports grouped by DLL / PDB GUID/age / TLS / resources / .NET CLR
fields) precedes the byte emission.
- text.py — two-pass Capstone disassembly. The first pass walks
every direct call / jmp <imm> target inside the segment to seed
function / branch labels (func_<va>, loc_<va>); the second
emits instructions with operand strings rewritten so addresses,
IAT slots, exports, and RIP-relative loads resolve to readable
labels. GAS-incompatible Capstone outputs (popal, xword ptr,
scalar SSE size qualifiers, riz/eiz SIB placeholders,
oversized enter immediates, etc.) are rewritten so the .s
output assembles cleanly. With exact_encoding: true the
instruction bytes are emitted verbatim with the decoded mnemonic
as a trailing comment — necessary for byte-identical round-trip
through GAS + objcopy.
- asm.py — alias so YAML can use type: asm (other splat
platforms' convention).
- data.py — heuristic emission of .data bytes:
- pointer slots flagged by base-relocations -> .long / .quad
with symbolic target labels (or raw hex when exact_encoding);
- NUL-terminated printable runs -> .asciz;
- UTF-16LE wide strings -> raw .byte plus a /* L"..." */
preview comment;
- long zero runs collapsed into .space N directives.
- rodata.py — alias on top of data.py with LINKER_SECTION = .rodata
and string-detection / pointer-heuristic on by default (read-only
data overwhelmingly contains strings or function-pointer tables).
- bss.py — NOLOAD reservation with .space N derived from the
YAML's bss_size: or vram_end - vram_start.
- bin.py — opaque marker class reusing CommonSegBin for sections
whose bytes are structured loader-time data (.rsrc / .reloc /
.idata / coff_symtab / signature).
- pdata.py — PE32+ exception directory: each RUNTIME_FUNCTION
record renders as a .long Begin, End, Unwind row with
func_<va> labels resolved by reflinking through the symbol
table. The unwind RVA is emitted symbolically as
(unwind_<va> - ImageBase) | 0x80000000 so the chained-record
flag stays correct. In exact_encoding mode the rows emit raw
hex RVAs for byte-identical reassembly. Each row's trailing
comment carries the decoded UNWIND_INFO opcode list (PUSH_NONVOL,
ALLOC_SMALL, SAVE_NONVOL, etc.) when one was found.
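To make the data.py heuristics in the list above concrete, here is a rough sketch of two of them: NUL-terminated printable runs become `.asciz`, and long zero runs collapse into `.space N`. It is an illustration only; `emit_data`, `MIN_STRING`, and `MIN_ZERO_RUN` are hypothetical names and thresholds, and the pointer-slot and UTF-16LE handling is left out.

```python
MIN_STRING = 4     # assumed: shortest printable run treated as a string
MIN_ZERO_RUN = 8   # assumed: shortest zero run collapsed into .space
PRINTABLE = set(range(0x20, 0x7F))  # plain printable ASCII only


def escape(text: str) -> str:
    """Escape the two characters that would break a GAS string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def emit_data(data: bytes) -> list[str]:
    """Turn raw bytes into GAS directives using simple heuristics."""
    out: list[str] = []
    i, n = 0, len(data)
    while i < n:
        # Heuristic 1: collapse a long run of zero bytes.
        j = i
        while j < n and data[j] == 0:
            j += 1
        if j - i >= MIN_ZERO_RUN:
            out.append(f".space {j - i}")
            i = j
            continue
        # Heuristic 2: a printable run that ends in a NUL is a C string.
        j = i
        while j < n and data[j] in PRINTABLE:
            j += 1
        if j - i >= MIN_STRING and j < n and data[j] == 0:
            out.append(f'.asciz "{escape(data[i:j].decode("ascii"))}"')
            i = j + 1  # .asciz emits the terminating NUL itself
            continue
        # Fallback: emit the byte verbatim.
        out.append(f".byte 0x{data[i]:02x}")
        i += 1
    return out


for line in emit_data(b"hello\0" + bytes(16) + b"\x01\x02"):
    print(line)
```

Each branch consumes exactly the bytes it emits, so the directive stream stays the same length as the input, which a byte-identical round trip requires.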
Behaviour is gated entirely off the win32 platform module added in
the previous commit — no changes to common splat segtypes.
… create_config
Splat-core hooks needed to expose the new win32 platform:
- util/options.py: accept win32 in the parse_opt_within list of
valid platforms. Default auto_link_sections to [] for
platform: win32 (the existing MIPS-style [.data, .rodata, .bss]
default generates phantom LinkerEntries on PE binaries because
each section is its own subsegment rather than an implicit
sibling of a base text segment).
- util/compiler.py: register MSVC2..14, MINGW, CLANG_LLD. All
share the same MASM-style asm config (.globl for symbols, no
end-label, no INCLUDE_ASM). Distinct names keep generated
configs documenting which toolchain produced the binary.
- util/file_presets.py: short-circuit for platform == "win32" in
write_assembly_inc_files — the win32 segtypes emit asm directly
with .section ... , "<flags>" headers and don't rely on the
macros.inc / labels.inc helpers.
- disassembler/disassembler_instance.py: route
platform == "win32" to the Capstone facade added in the earlier
commit.
- segtypes/__init__.py + platforms/__init__.py: re-export the new
win32 packages.
- scripts/create_config.py: new create_win32_config branch.
Detects PE files (MZ + PE magic) and emits a YAML + symbol_addrs
layout covering every parsed directory:
* Per-section subsegments (text / data / rodata / pdata / bss /
bin per the section characteristics; .reloc and .rsrc as bin).
* Entrypoint, exports (incl. forwarder comments), eager imports,
delay imports, TLS callbacks, SafeSEH handlers, /guard:cf
targets, /GS security cookie, .NET CLR pointers, unwind RVAs,
all as named symbol_addrs entries with type tags.
* MSVC version auto-detected from MajorLinkerVersion;
MinGW / Clang-LLD recognised via section names + import DLL
fingerprints.
* Post-section appendages (COFF symtab, Authenticode signature)
emitted as bin segments with a high-address sandbox VMA so
the linker doesn't fold them onto loaded sections.
* Sanitisation rules shared with the segtypes via
platforms.win32.sanitize_label so disassembly references
resolve to the same identifiers the YAML declares.
No behaviour change for existing platforms — all win32 logic is
gated behind platform == "win32".
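As an illustration of the per-section subsegment choice described above, the sketch below maps a section name and its characteristics flags to a segment type. The flag values are the standard `IMAGE_SCN_*` constants from the PE/COFF specification; the function name `classify_section` and the order of the checks are assumptions, not the logic in `create_config.py`.

```python
IMAGE_SCN_CNT_CODE = 0x00000020
IMAGE_SCN_CNT_INITIALIZED_DATA = 0x00000040
IMAGE_SCN_CNT_UNINITIALIZED_DATA = 0x00000080
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000

# Sections the description says are kept as opaque bin segments.
OPAQUE_SECTIONS = {".rsrc", ".reloc"}


def classify_section(name: str, characteristics: int) -> str:
    """Pick a splat segment type for one PE section (assumed rule order)."""
    if name in OPAQUE_SECTIONS:
        return "bin"
    if name == ".pdata":
        return "pdata"
    if characteristics & (IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE):
        return "text"
    if characteristics & IMAGE_SCN_CNT_UNINITIALIZED_DATA:
        return "bss"
    if characteristics & IMAGE_SCN_CNT_INITIALIZED_DATA:
        # Writable initialised data is .data; read-only is .rodata.
        return "data" if characteristics & IMAGE_SCN_MEM_WRITE else "rodata"
    return "bin"  # unknown content: keep the bytes opaque


for name, flags in [(".text", 0x60000020), (".rdata", 0x40000040),
                    (".data", 0xC0000040), (".reloc", 0x42000040)]:
    print(name, "->", classify_section(name, flags))
```

Name checks come first in this sketch because `.pdata`, `.rsrc`, and `.reloc` all carry the generic initialised-data flag and would otherwise be classified as rodata.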
New python -m splat.scripts.win32_reassemble <yaml> driver that
inverts splat split for win32 binaries.
Pipeline:
1. Run as on every .s under asm_path / data_path -> .o files at
the build_path layout the splat-generated linker script
expects (<build_path>/asm/<rel>.s.o by default, <rel>.o when
the YAML has o_as_suffix: True).
2. Wrap any .bin assets via objcopy -I binary -O elf... so they
can be linked in.
3. Invoke ld -N -m elf_i386 | elf_x86_64 -T <splat.ld> from
base_path. The splat linker script already places each section
at the right LMA / file-offset.
4. objcopy --set-section-flags .header=alloc,load,data (the
custom .header section starts READONLY-only from GAS),
then objcopy -O binary to extract the linked image —
the .header section already carries the full DOS+COFF+optional
header bytes, so the binary blob IS the reassembled PE.
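Steps 3 and 4 above can be sketched as command lists handed to `subprocess`. This is a hedged outline rather than the driver's code: `build_link_commands` and `run_all` are hypothetical names, the `-o` output argument and the file paths are assumptions, and the remaining flags are the ones quoted in the steps.

```python
import subprocess


def build_link_commands(is_pe32_plus: bool, linker_script: str,
                        elf_path: str, pe_path: str) -> list[list[str]]:
    """Return the ld / objcopy invocations for steps 3 and 4."""
    emulation = "elf_x86_64" if is_pe32_plus else "elf_i386"
    return [
        # Step 3: link using the splat-generated linker script.
        ["ld", "-N", "-m", emulation, "-T", linker_script, "-o", elf_path],
        # Step 4a: make the custom .header section loadable (in place).
        ["objcopy", "--set-section-flags", ".header=alloc,load,data", elf_path],
        # Step 4b: flatten the linked image into the reassembled PE.
        ["objcopy", "-O", "binary", elf_path, pe_path],
    ]


def run_all(commands: list[list[str]], base_path: str) -> None:
    """Run each command from base_path, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=base_path, check=True)


for cmd in build_link_commands(False, "app.ld", "build/app.elf", "build/app.exe"):
    print(" ".join(cmd))
```

Building the argument lists separately from running them keeps the bitness switch testable without a binutils install.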
With exact_encoding: true on the text/data/pdata subsegments, the
reassembled PE is byte-identical to the original. Verified on
PsExec / PsExec64 / PuTTY 0.60 / PuTTY 0.70 32-bit / PuTTY 0.83
64-bit (716 KB to 1.7 MB; vintage MSVC6 through modern MSVC14;
both PE32 and PE32+).
Coverage:
- test_win32_pe.py (~5000 LOC, 199 unit tests): every PE parser branch (DOS/COFF/optional header, all data directories incl. fuzz-input edge cases: runt headers, fuzzed NumberOfRvaAndSizes, pathological string-table offsets, virtual-tail RVA rejection), label-generation helpers (sanitize_label / compute_iat_labels / compute_export_labels), every segtype's behaviour (header summary width adjustments for PE32+, exact_encoding inheritance, resource-only DLLs, all-forwarder shim DLLs, BSS-only PEs, phantom-pointer sections, etc.), CapstoneDisassembler engine selection, and Win32SegBin / Win32SegAsm marker semantics.
- test.py: 10 end-to-end test methods covering split + assemble + byte-identical round-trip on PE32 and PE32+ synthetic fixtures, plus a win32_reassemble byte-identity test for each bitness.
- test/win32_app/ + test/win32_app64/: synthetic PE32 / PE32+ fixtures with generate.py scripts that emit the binaries on the fly (so the suite is hermetic and no binary blobs are committed).
- test-binaries/zoo/README.md: catalogue of 30 freely-redistributable PE binaries spanning 1995-2025 + ARM64, organised by era band (MSVC 4-6 through MSVC 14.x, MinGW, ScummVM, Sysinternals PSTools, PuTTY, etc.) with direct download URLs. Binaries themselves stay outside the repo via the .gitignore.
- pyproject.toml: new win32 optional dependency group (capstone>=5.0.0). dev pulls it in.
- README.md: surface the win32 platform support and the three user-facing scripts (splat split / create_config / win32_reassemble).
- CHANGELOG.md: unreleased entry summarising the platform addition.
Hi! That's a lot of work, pretty impressive.

Are you part of any existing win decomp communities? What are they currently using for their decomp work? Are you using this as part of your decomp workflow? What other tooling are you using? Would be nice if you shared it so we can take a look.

The x86_64 support surprises me, I haven't heard of anyone doing windows x86_64 decomp so this is kind of new.

Also, you mention in your post that you were able to rebuild a byte-by-byte identical PE binary. To me that's a bit hard to believe.

Thanks for reaching out!

Hi, thanks for taking a look.

Specific target is Not really part of any broader win decomp community beyond that. Nothing splat-shaped exists for PE that I've found, which was part of why this seemed worth doing.

On the tooling side I'm also working on https://github.com/maci0/rebrew, a compiler-in-the-loop matching workbench: GA engine for flag/source mutation to drive functions toward byte-exact, symbolic equivalence checks via angr+Z3 for NEAR_MATCHING cases, FLIRT against library identification, plus Ghidra sync via ReVa MCP. splat could slot in upstream of that, it's what could produce the initial per-TU skeleton rebrew then iterates on.

x86_64 was almost an accident, the same parsing code path handles both and Capstone covers both, so once PE32 worked PE32+ mostly came for free. The genuinely different parts are .pdata and UNWIND_INFO, since that's what's new vs PE32. PuTTY 0.83 x64 (~1.7MB, 2410 RUNTIME_FUNCTIONs) was my stress test there.

On byte-identical, fair to be skeptical. PE is structurally a lot friendlier than ELF for this though. No GOT, no PLT, no link-time inter-TU relocations. Import resolution happens at load time, so the import table is just static data in the binary. Sections land in file order and Some of our targets actually round-trip byte-identical with the default disasm (no

One thing that did need thought: post-section appendages like the Authenticode signature blob or a high-offset COFF symtab. Those needed explicit high-vram entries in the yaml so the section layout doesn't collapse on them. PsExec is signed so that path's exercised end-to-end.

Verified on PsExec / PsExec64 / PuTTY 0.60 / PuTTY 0.70 32-bit / PuTTY 0.83 64-bit, plus our actual There's a
Summary
New `platform: win32` covering PE32 (x86) and PE32+ (x86_64) binaries. Self-contained — no changes to existing platform behaviour.
Split across 5 commits for review:
What's in scope
Parsing. Walks DOS stub + COFF header + optional header (both bitnesses) + every populated data directory: exports, imports, resources, exception table (with x64 SEH unwind-info opcode decode), base relocations, debug (CodeView PDB GUID/age extraction), TLS callbacks, load config (/GS cookie, /SAFESEH handlers, /guard:cf table), bound imports, delay imports, .NET CLR runtime header. Also the deprecated COFF symbol table when vintage MSVC4-6 binaries ship one. Every iteration loop has a defensive fuzz-cap.
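The exception-table part of the parsing can be illustrated with a short sketch. It assumes the x64 layouts from the PE/COFF and x64 exception-handling documentation: a 12-byte `RUNTIME_FUNCTION` record (three little-endian 32-bit RVAs) and an `UNWIND_INFO` header followed by 2-byte unwind-code slots. The function names and the `max_records` cap are hypothetical and only meant to show the fuzz-cap idea.

```python
import struct
from typing import Iterator

UWOP_NAMES = {
    0: "PUSH_NONVOL", 1: "ALLOC_LARGE", 2: "ALLOC_SMALL", 3: "SET_FPREG",
    4: "SAVE_NONVOL", 5: "SAVE_NONVOL_FAR", 8: "SAVE_XMM128",
    9: "SAVE_XMM128_FAR", 10: "PUSH_MACHFRAME",
}


def iter_runtime_functions(pdata: bytes, max_records: int = 1 << 20
                           ) -> Iterator[tuple[int, int, int]]:
    """Yield (begin_rva, end_rva, unwind_rva), capped at max_records."""
    usable = len(pdata) - len(pdata) % 12  # ignore a trailing partial record
    for index, record in enumerate(struct.iter_unpack("<III", pdata[:usable])):
        if index >= max_records:
            break
        yield record


def slots_for(op: int, info: int) -> int:
    """Number of 2-byte slots an unwind code occupies."""
    if op == 1:              # ALLOC_LARGE: 16-bit or 32-bit size follows
        return 3 if info == 1 else 2
    if op in (4, 8):         # SAVE_NONVOL / SAVE_XMM128: 16-bit offset follows
        return 2
    if op in (5, 9):         # *_FAR variants: 32-bit offset follows
        return 3
    return 1


def decode_unwind_ops(blob: bytes, offset: int = 0) -> list[str]:
    """Return the opcode names listed in one UNWIND_INFO structure."""
    if offset + 4 > len(blob):
        return []
    _ver_flags, _prolog, count, _frame = struct.unpack_from("<4B", blob, offset)
    names, pos, remaining = [], offset + 4, count
    # Bounded by CountOfCodes (one byte) and by the buffer length.
    while remaining > 0 and pos + 2 <= len(blob):
        op, info = blob[pos + 1] & 0x0F, blob[pos + 1] >> 4
        names.append(UWOP_NAMES.get(op, f"UNKNOWN_{op}"))
        step = slots_for(op, info)
        pos += 2 * step
        remaining -= step
    return names


# Prologue "push rbp; sub rsp, 0x20": codes are stored in reverse order.
info = bytes([0x01, 0x05, 0x02, 0x00, 0x05, 0x32, 0x01, 0x50])
print(decode_unwind_ops(info))
```

The loop always advances by at least one slot and stops at the end of the buffer, so a fuzzed `CountOfCodes` cannot make it run away.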
Disassembly. Capstone-backed (`capstone>=5.0.0` as a new optional dependency group). Two-pass: the first scan collects every direct `call`/`jmp` target inside the segment to seed function/branch labels; the second pass emits instructions with operands rewritten so addresses, IAT slots, exports, and RIP-relative loads resolve to readable labels. GAS-incompatible Capstone outputs (`popal`, `xword ptr`, scalar SSE size qualifiers, `riz`/`eiz` placeholders, oversized `enter` immediates, etc.) are rewritten so the `.s` output assembles cleanly.
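The two-pass idea can be shown without Capstone by working on already-decoded instructions. In this sketch `Insn`, `collect_labels`, and `render` are hypothetical names, the 8-digit hex label format is an assumption, and the operand rewriting is reduced to direct branch targets only.

```python
from typing import NamedTuple, Optional


class Insn(NamedTuple):
    address: int
    mnemonic: str
    target: Optional[int]  # direct branch target, or None


def collect_labels(insns: list[Insn], seg_start: int, seg_end: int) -> dict[int, str]:
    """Pass 1: seed func_/loc_ labels from direct call/jmp targets."""
    labels: dict[int, str] = {}
    for insn in insns:
        if insn.target is None or not seg_start <= insn.target < seg_end:
            continue  # indirect branch, or target outside this segment
        if insn.mnemonic == "call":
            labels[insn.target] = f"func_{insn.target:08x}"  # calls win
        else:
            labels.setdefault(insn.target, f"loc_{insn.target:08x}")
    return labels


def render(insns: list[Insn], labels: dict[int, str]) -> list[str]:
    """Pass 2: emit instructions with targets replaced by labels."""
    lines: list[str] = []
    for insn in insns:
        if insn.address in labels:
            lines.append(f"{labels[insn.address]}:")
        if insn.target is None:
            lines.append(f"    {insn.mnemonic}")
        else:
            operand = labels.get(insn.target, f"0x{insn.target:x}")
            lines.append(f"    {insn.mnemonic} {operand}")
    return lines


code = [
    Insn(0x401000, "call", 0x401010),
    Insn(0x401005, "jmp", 0x40100A),
    Insn(0x40100A, "ret", None),
    Insn(0x401010, "ret", None),
]
print("\n".join(render(code, collect_labels(code, 0x401000, 0x401020))))
```

Collecting all labels before emitting anything is what lets a forward reference (the `call` at the top pointing at a later function) resolve to a name instead of a raw address.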
Round-trip. With `exact_encoding: true` on the text/data/pdata subsegments, the reassembled PE is byte-identical to the original. Verified on 5 real-world binaries: PsExec, PsExec64, PuTTY 0.60, PuTTY 0.70 32-bit, and PuTTY 0.83 64-bit.
New driver: `python -m splat.scripts.win32_reassemble <yaml>` runs `as` + `ld` + `objcopy` against the splat-generated linker script + sections.
Tooling.
What's deferred (could be follow-ups)
Tests
```
$ python -m unittest test test_n64_entrypoints test_win32_pe
Ran 199 tests in 0.079s -> OK
$ python -m unittest test.Testing
Ran 10 tests in 0.062s -> OK
$ python -m mypy src/splat
Success: no issues found in 111 source files
```
`test_win32_pe.py` covers every parser branch + fuzz-input edge cases; `test.py` covers split + assemble + byte-identical round-trip end-to-end for both PE32 and PE32+ synthetic fixtures.
Test plan