
seldak/rld


rld — a toy static linker for ELF x86_64 (Rust)

rld is a small, opinionated static linker written in Rust. It links ELF relocatable objects (ET_REL, x86_64) into a runnable Linux ELF executable (ET_EXEC) by:

  • parsing .o files (ELF)
  • resolving global symbols
  • laying out .text and .data
  • applying a minimal set of AMD64 relocations
  • emitting a minimal ELF64 executable (two PT_LOAD segments)

This project is intentionally scoped to be understandable and hackable: the goal is to learn and demonstrate linker fundamentals, not to replace ld.


Status

Works end-to-end for small non-PIE, non-libc examples: after running ./a.out, echo $? prints the linked program's exit code.

Supported (current)

  • Input: ELF64 ET_REL, EM_X86_64
  • Sections: .text and .data (.bss is ignored for now)
  • Relocations (x86_64):
    • R_X86_64_PLT32 (call rel32): S + A - P → i32
    • R_X86_64_PC32: S + A - P → i32
    • R_X86_64_32S: S + A → i32
    • R_X86_64_64: S + A → u64
  • Output:
    • --emit raw: debug container [text_size u64][data_size u64][text][data]
    • --emit elf: minimal ELF64 ET_EXEC with 2 load segments (RX text, RW data), no section headers
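
The raw container layout listed above is simple enough to pack and unpack in a few lines. The following is a minimal sketch, not rld's actual code: the function names are hypothetical, and it assumes the two size fields are little-endian, which the README does not state.

```rust
/// Pack `.text` and `.data` into the raw debug container:
/// [text_size u64][data_size u64][text][data].
/// Assumption: both sizes are stored little-endian.
fn pack_raw(text: &[u8], data: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(16 + text.len() + data.len());
    out.extend_from_slice(&(text.len() as u64).to_le_bytes());
    out.extend_from_slice(&(data.len() as u64).to_le_bytes());
    out.extend_from_slice(text);
    out.extend_from_slice(data);
    out
}

/// Read a little-endian u64 at `at`, or None if the buffer is too short.
fn read_u64_le(buf: &[u8], at: usize) -> Option<u64> {
    let bytes = buf.get(at..at.checked_add(8)?)?;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(bytes);
    Some(u64::from_le_bytes(raw))
}

/// Split a raw container back into (text, data).
/// Returns None if the buffer is shorter than the sizes it declares.
fn unpack_raw(buf: &[u8]) -> Option<(&[u8], &[u8])> {
    let text_size = read_u64_le(buf, 0)? as usize;
    let data_size = read_u64_le(buf, 8)? as usize;
    let text_end = 16usize.checked_add(text_size)?;
    let data_end = text_end.checked_add(data_size)?;
    Some((buf.get(16..text_end)?, buf.get(text_end..data_end)?))
}
```

Because the header is a fixed 16 bytes, the first line of an xxd dump of out.bin shows both sizes, and the .text bytes start at offset 0x10.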

Not supported (yet)

  • PIC / PIE (ET_DYN)
  • dynamic linking / PLT/GOT resolution
  • .bss emission / COMMON symbols
  • merging more sections (.rodata, etc.)
  • .eh_frame / unwind metadata (compile inputs with unwind disabled)

Quick start

Build

cargo build

Create sample inputs

The simplest demo links two objects:

  • a.o defines x and foo()
  • b.o defines _start and exits with the return value of foo()

Use these compile flags to avoid .eh_frame / unwind complexity early:

clang -c -O0 -fno-pic -nostdlib -fno-asynchronous-unwind-tables -fno-unwind-tables inputs/a.c -o inputs/a.o
clang -c -O0 -fno-pic -nostdlib -fno-asynchronous-unwind-tables -fno-unwind-tables inputs/b.c -o inputs/b.o

A minimal _start that performs a Linux syscall exit:

// inputs/b.c
extern int foo(void);

__attribute__((noreturn))
void _start(void) {
    long code = foo();
    __asm__ volatile(
        "mov $60, %%rax\n"   // SYS_exit
        "mov %0, %%rdi\n"
        "syscall\n"
        :
        : "r"(code)
        : "%rax", "%rdi", "%rcx", "%r11", "memory"  // syscall also clobbers rcx and r11
    );
    __builtin_unreachable();
}

Inspect objects

cargo run -- dump inputs/a.o
cargo run -- dump inputs/b.o

Link to a runnable ELF

cargo run -- link inputs/a.o inputs/b.o -o a.out --emit elf
chmod +x a.out
./a.out
echo $?
# expected: 8

Link to a raw debug container (optional)

cargo run -- link inputs/a.o inputs/b.o -o out.bin --emit raw
xxd -g 1 -l 128 out.bin
cargo run -- hexdump out.bin --from 0 --len 128

Commands

rld dump <file.o>

Prints ELF headers, sections, symbols, and relocations. Useful for understanding what the compiler generated.

rld link <a.o> <b.o> ... -o <output> [--emit raw|elf]

Links one or more relocatable objects into a single output file.

  • --emit raw outputs the debug container.
  • --emit elf outputs a runnable Linux ELF64 ET_EXEC.

You can also override bases:

  • --text-base 0x401000
  • --data-base 0x402000
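
To illustrate how the two bases could map onto the two PT_LOAD segments of the ELF output, here is a minimal sketch. It is not rld's emitter: `Phdr`, `load_segments`, the 0x1000 page size, and the choice of file offsets are assumptions made for the example. The field order and the 56-byte size follow the standard Elf64_Phdr layout.

```rust
const PT_LOAD: u32 = 1;
const PF_X: u32 = 1;
const PF_W: u32 = 2;
const PF_R: u32 = 4;
const PAGE: u64 = 0x1000; // assumption: 4 KiB pages

/// The fields of one PT_LOAD program header that vary in this sketch.
#[derive(Debug)]
struct Phdr {
    flags: u32,
    offset: u64,
    vaddr: u64,
    size: u64,
}

impl Phdr {
    /// Serialize as a little-endian Elf64_Phdr (56 bytes).
    fn to_bytes(&self) -> Vec<u8> {
        let mut b = Vec::with_capacity(56);
        b.extend_from_slice(&PT_LOAD.to_le_bytes()); // p_type
        b.extend_from_slice(&self.flags.to_le_bytes()); // p_flags
        b.extend_from_slice(&self.offset.to_le_bytes()); // p_offset
        b.extend_from_slice(&self.vaddr.to_le_bytes()); // p_vaddr
        b.extend_from_slice(&self.vaddr.to_le_bytes()); // p_paddr
        b.extend_from_slice(&self.size.to_le_bytes()); // p_filesz
        b.extend_from_slice(&self.size.to_le_bytes()); // p_memsz (no .bss yet)
        b.extend_from_slice(&PAGE.to_le_bytes()); // p_align
        b
    }
}

fn align_up(x: u64, align: u64) -> u64 {
    (x + align - 1) & !(align - 1)
}

/// Hypothetical helper: build the RX text and RW data segments.
/// The loader needs p_offset and p_vaddr to agree modulo the page size,
/// so this sketch keeps both page-aligned.
fn load_segments(
    text_base: u64,
    data_base: u64,
    text_len: u64,
    data_len: u64,
) -> Result<[Phdr; 2], String> {
    if text_base % PAGE != 0 || data_base % PAGE != 0 {
        return Err("bases must be page-aligned in this sketch".to_string());
    }
    if data_base < text_base + text_len {
        return Err("data segment would overlap the text segment".to_string());
    }
    let text_off = PAGE; // assumption: the ELF headers occupy the first page
    let data_off = align_up(text_off + text_len, PAGE);
    Ok([
        Phdr { flags: PF_R | PF_X, offset: text_off, vaddr: text_base, size: text_len },
        Phdr { flags: PF_R | PF_W, offset: data_off, vaddr: data_base, size: data_len },
    ])
}
```

With the bases shown above and a .text shorter than one page, this places .text at file offset 0x1000 and .data at 0x2000.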

rld hexdump <file> [--from N] [--len N] [--width N]

Minimal hexdump for quick sanity checks.
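
For reference, a hexdump with these three options fits in a short function. The sketch below is illustrative only: the function name and the exact output format (8-digit hex offset, two spaces, space-separated bytes) are assumptions and may not match what rld prints.

```rust
/// Format `len` bytes of `bytes` starting at `from`, `width` bytes per row.
/// Out-of-range `from`/`len` values are clamped to the buffer.
fn hexdump(bytes: &[u8], from: usize, len: usize, width: usize) -> String {
    let width = width.max(1); // avoid a zero chunk size
    let start = from.min(bytes.len());
    let end = start.saturating_add(len).min(bytes.len());
    let mut out = String::new();
    for (i, row) in bytes[start..end].chunks(width).enumerate() {
        let hex: Vec<String> = row.iter().map(|b| format!("{:02x}", b)).collect();
        // Each row starts with the absolute offset of its first byte.
        out.push_str(&format!("{:08x}  {}\n", start + i * width, hex.join(" ")));
    }
    out
}
```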

Architecture

High-level pipeline

flowchart TD
  A["ELF .o files (ET_REL)"] --> B["Parse sections / symbols / relocs"]
  B --> C[Layout output .text/.data]
  C --> D[Build global symbol table]
  D --> E[Finalize symbol addresses]
  E --> F[Apply relocations into output buffers]
  F --> G{Emit format}
  G -->|raw| H[Debug container out.bin]
  G -->|elf| I[ELF ET_EXEC a.out]
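
The layout and address-finalization steps in this pipeline can be sketched as follows. This is a simplified illustration, not rld's code: the function names are hypothetical, and a single fixed alignment is an assumption (a fuller linker would honor each input section's sh_addralign).

```rust
/// Round `x` up to the next multiple of `align` (a power of two).
fn align_up(x: u64, align: u64) -> u64 {
    (x + align - 1) & !(align - 1)
}

/// Concatenate per-object section contributions.
/// Returns each object's offset in the output section and the total size.
fn layout(sizes: &[u64], align: u64) -> (Vec<u64>, u64) {
    let mut offsets = Vec::with_capacity(sizes.len());
    let mut cursor = 0u64;
    for &size in sizes {
        cursor = align_up(cursor, align);
        offsets.push(cursor);
        cursor += size;
    }
    (offsets, cursor)
}

/// Final address of a defined symbol:
/// output section base + the object's offset in that section + st_value.
fn symbol_addr(section_base: u64, out_off: u64, value: u64) -> u64 {
    section_base + out_off + value
}
```

For example, three .text contributions of 5, 20, and 3 bytes with 16-byte alignment land at offsets 0, 16, and 48, and a symbol at st_value 4 in the second object resolves to 0x401014 when .text is based at 0x401000.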

Module diagram (current)

graph LR
  main[main.rs] --> cli[cli.rs]
  main --> link[link.rs]
  main --> emit[emit.rs]
  main --> dump[elf_read.rs]
  main --> hex[hexdump.rs]
  link --> model[model.rs]
  emit --> link

Core data structures (conceptual)

classDiagram
  class InputObject {
    path: PathBuf
    text: Vec<u8>
    data: Vec<u8>
    out_text_off: u64
    out_data_off: u64
    text_shndx: Option<u16>
    data_shndx: Option<u16>
    symbols: Vec<InSymbol>
    relas_text: Vec<Rela>
  }

  class InSymbol {
    name: String
    bind: u8
    shndx: u16
    value: u64
    size: u64
  }

  class Rela {
    offset: u64
    sym: u32
    rtype: u32
    addend: i64
  }

  InputObject --> InSymbol
  InputObject --> Rela
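
As an illustration of how a global symbol table could be built from these structures, here is a minimal sketch. It uses a trimmed-down `InSymbol` and a hypothetical `build_globals` function; treating only STB_GLOBAL definitions as candidates (ignoring weak symbols) is an assumption, not a statement about rld's behavior.

```rust
use std::collections::HashMap;

const STB_GLOBAL: u8 = 1; // ELF symbol binding: global
const SHN_UNDEF: u16 = 0; // section index of an undefined symbol

/// Subset of the InSymbol fields from the diagram above.
#[allow(dead_code)]
struct InSymbol {
    name: String,
    bind: u8,
    shndx: u16,
    value: u64,
}

/// Convenience constructor for the example.
fn sym(name: &str, bind: u8, shndx: u16, value: u64) -> InSymbol {
    InSymbol { name: name.to_string(), bind, shndx, value }
}

/// Map each defined global name to (object index, symbol index).
/// Local and undefined symbols are skipped; a second definition of the
/// same global name is reported as an error.
fn build_globals(objects: &[Vec<InSymbol>]) -> Result<HashMap<String, (usize, usize)>, String> {
    let mut globals = HashMap::new();
    for (oi, syms) in objects.iter().enumerate() {
        for (si, s) in syms.iter().enumerate() {
            if s.bind != STB_GLOBAL || s.shndx == SHN_UNDEF {
                continue;
            }
            if globals.insert(s.name.clone(), (oi, si)).is_some() {
                return Err(format!("multiple definition of `{}`", s.name));
            }
        }
    }
    Ok(globals)
}
```

An undefined reference in one object (shndx equal to SHN_UNDEF) is then resolved by looking its name up in this map; a missing entry means an undefined symbol.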

Notes on relocations (what gets patched)

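rld implements the relocation equations defined by the AMD64 psABI, with one simplification: the psABI defines R_X86_64_PLT32 as L + A - P (L being the symbol's PLT entry), and because rld creates no PLT, it resolves the call directly to the symbol (L = S). In practice: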

  • PC-relative call/jump style relocations:

    • R_X86_64_PLT32, R_X86_64_PC32
    • patched value: S + A - P written as signed 32-bit
  • Absolute relocations:

    • R_X86_64_32S: S + A signed 32-bit
    • R_X86_64_64: S + A 64-bit

Where:

  • S: resolved symbol address
  • A: relocation addend (from RELA)
  • P: address of relocation place (the patch site)
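
The equations above translate almost directly into code. The following is a minimal sketch under stated assumptions, not rld's implementation: `apply_reloc` and `fit_i32` are hypothetical names, `buf` stands for the output section being patched, and `off` is the patch site's offset within it (so P is the section base plus `off`). The numeric relocation type values are the ones assigned by the psABI.

```rust
const R_X86_64_64: u32 = 1;
const R_X86_64_PC32: u32 = 2;
const R_X86_64_PLT32: u32 = 4;
const R_X86_64_32S: u32 = 11;

/// Check that a computed value fits a signed 32-bit field.
fn fit_i32(v: i64) -> Result<i32, String> {
    if v < i32::MIN as i64 || v > i32::MAX as i64 {
        Err(format!("relocation value {:#x} does not fit in i32", v))
    } else {
        Ok(v as i32)
    }
}

/// Patch one relocation into `buf` at byte offset `off`.
/// s = resolved symbol address, a = addend, p = address of the patch site.
fn apply_reloc(buf: &mut [u8], off: usize, rtype: u32, s: u64, a: i64, p: u64) -> Result<(), String> {
    let oob = || "relocation offset out of bounds".to_string();
    match rtype {
        // PC-relative: S + A - P, written as a signed 32-bit value.
        R_X86_64_PC32 | R_X86_64_PLT32 => {
            let v = fit_i32((s as i64).wrapping_add(a).wrapping_sub(p as i64))?;
            let end = off.checked_add(4).ok_or_else(oob)?;
            buf.get_mut(off..end).ok_or_else(oob)?.copy_from_slice(&v.to_le_bytes());
        }
        // Absolute: S + A, must fit in a sign-extended 32-bit value.
        R_X86_64_32S => {
            let v = fit_i32((s as i64).wrapping_add(a))?;
            let end = off.checked_add(4).ok_or_else(oob)?;
            buf.get_mut(off..end).ok_or_else(oob)?.copy_from_slice(&v.to_le_bytes());
        }
        // Absolute: S + A, written as a full 64-bit value.
        R_X86_64_64 => {
            let v = s.wrapping_add(a as u64);
            let end = off.checked_add(8).ok_or_else(oob)?;
            buf.get_mut(off..end).ok_or_else(oob)?.copy_from_slice(&v.to_le_bytes());
        }
        other => return Err(format!("unsupported relocation type {}", other)),
    }
    Ok(())
}
```

As a worked case: a call whose 4-byte operand sits at P = 0x401005, targeting S = 0x401020 with A = -4, is patched with 0x401020 - 4 - 0x401005 = 0x17. The addend of -4 accounts for the CPU computing the branch target from the end of the operand, not its start.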

Roadmap / next steps

If you want to keep extending this into a more “real” linker:

  • .rodata support and proper section merging by flags
  • ar archives (.a) and selective member extraction
  • Better error messages (multiple-def, undefined symbol listing)
  • Optional: section headers + symbol table emission in output ELF
  • PIE (ET_DYN) and RIP-relative data via GOT (bigger project)
