bungcip/cendol

Cendol

Cendol is a C23 compiler implemented in Rust. It is a learning project for understanding how a compiler is built from scratch, with a focus on high-performance compiler architecture and broad C23 standard compliance.

Features

  • C23 Preprocessor: Macro expansion, conditional compilation, file inclusion, and built-in macros (__FILE__, __LINE__, etc.). The #embed directive is not yet implemented (see Limitations).
  • Lexer: Tokenization of C23 source code with proper handling of literals, keywords, and operators
  • Parser: Comprehensive C23 syntax parsing using Pratt parsing for expressions and recursive descent for statements
  • Semantic Analysis: Type checking, symbol resolution, and semantic validation
  • Code Generation: Compiles to native object code using Cranelift backend
  • Linker Integration: Automatic invocation of system linker (clang) to produce executables
  • Rich Diagnostics: Error reporting with source location tracking
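
To illustrate what "Pratt parsing for expressions" means, here is a minimal sketch in C. It is not Cendol's parser (which is written in Rust and builds an AST); it only evaluates integer expressions directly. The names Cursor, binding_power, parse_prefix, parse_expr, and eval are hypothetical, chosen for this example.

```c
#include <ctype.h>

/* Cursor over the input text; a hypothetical helper type for this sketch. */
typedef struct { const char *p; } Cursor;

static void skip_ws(Cursor *c) { while (*c->p == ' ') c->p++; }

/* Binding power of a binary operator, or 0 if `op` is not an operator. */
static int binding_power(char op) {
    switch (op) {
    case '+': case '-': return 10;
    case '*': case '/': return 20;
    default:            return 0;
    }
}

static long parse_expr(Cursor *c, int min_bp);

/* Prefix position: unary minus, parenthesized expression, or integer literal. */
static long parse_prefix(Cursor *c) {
    skip_ws(c);
    if (*c->p == '-') { c->p++; return -parse_prefix(c); }
    if (*c->p == '(') {
        c->p++;
        long inner = parse_expr(c, 0);
        skip_ws(c);
        if (*c->p == ')') c->p++;
        return inner;
    }
    long v = 0;
    while (isdigit((unsigned char)*c->p)) v = v * 10 + (*c->p++ - '0');
    return v;
}

/* Core Pratt loop: consume operators only while they bind tighter than min_bp. */
static long parse_expr(Cursor *c, int min_bp) {
    long lhs = parse_prefix(c);
    for (;;) {
        skip_ws(c);
        char op = *c->p;
        int bp = binding_power(op);
        if (bp == 0 || bp <= min_bp) return lhs;
        c->p++;
        /* Passing the operator's own power makes it left-associative. */
        long rhs = parse_expr(c, bp);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs = (rhs != 0) ? lhs / rhs : 0; break; /* avoid division by zero */
        }
    }
}

/* Evaluate an expression such as "1 + 2 * 3". */
long eval(const char *text) {
    Cursor c = { text };
    return parse_expr(&c, 0);
}
```

The single binding_power table is what makes the approach attractive: adding an operator means adding one table entry rather than a new grammar rule. A real parser would return AST nodes from parse_prefix and parse_expr instead of computing values.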

Limitations

  • No Trigraph Support: Trigraphs (three-character sequences like ??=, ??<, etc.) are not supported. Note: Trigraphs were officially removed in C23.
  • No Digraph Support: Digraphs (two-character sequences like <:, :>, <%, %>, %:, %:%:) are not supported.
  • No K&R Function Declarations: Functions declared with an empty parameter list (e.g., int foo()) are treated as int foo(void), following C23.
  • Missing C23 Language Features:
    • Bit-precise integers (_BitInt(N)) are not yet implemented.
    • Decimal floating-point types (_Decimal32, _Decimal64, _Decimal128) are not supported.
    • C23 Attribute Syntax ([[...]]) is not yet parsed.
    • #embed Directive: The C23 resource inclusion directive is not implemented.
  • No Standard Library: Cendol does not provide its own libc and relies on the system's C library for headers and linking.
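
As a rough guide to writing code that stays within the supported subset, the sketch below shows plain-C alternatives to the unsupported features listed above. The function names (answer, sum3, wrap8, must_check) are made up for illustration, and the snippet has not been checked against Cendol itself.

```c
/* In C23 an empty parameter list means "no parameters", so these two
   declarations are equivalent. */
int answer();
int answer(void);

int answer(void) { return 42; }

/* Digraphs such as <: :> <% %> are not supported; write [ ] { } directly. */
int sum3(void) {
    int values[3] = {1, 2, 3};
    return values[0] + values[1] + values[2];
}

/* Instead of `unsigned _BitInt(8)`, use a standard integer type and mask
   explicitly where the exact width matters. */
unsigned wrap8(unsigned x) { return x & 0xFFu; }

/* Instead of `[[nodiscard]] int must_check(void);`, use the plain
   declaration, since the [[...]] attribute syntax is not yet parsed. */
int must_check(void);
int must_check(void) { return 0; }
```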

Architecture

Cendol follows a traditional multi-phase compiler architecture optimized for performance:

  1. Preprocessing Phase: Expands macros and resolves #include directives in the C source
  2. Lexing Phase: Converts preprocessing tokens into the lexical tokens consumed by the parser
  3. Parsing Phase: Builds a flattened Abstract Syntax Tree (AST)
  4. Semantic Analysis Phase: Performs type checking and symbol resolution
  5. MIR Generation: Lowers AST to Mid-level Intermediate Representation
  6. Code Generation: Generates native machine code via Cranelift
  7. Linking: Links object files to create the final executable

Getting Started

Prerequisites

  • A Rust toolchain with 2024 edition support (Rust 1.85 or later)
  • Cargo
  • Clang (used as the system linker)

Building

To build the compiler, run:

cargo build

For a release build with optimizations:

cargo build --release

Usage

To compile a C file to an executable:

cargo run -- -o <output_file> <input_file>

Other Options

  • -E: Preprocess only, output preprocessed source to stdout
  • -P: Suppress line markers in preprocessor output
  • -C: Retain comments in preprocessor output
  • -I <path>: Add include search path
  • -D <name>[=<value>]: Define preprocessor macro
  • --verbose: Enable verbose diagnostic output

Examples

Preprocess a file:

cargo run -- -E test.c

Define macros and include paths:

cargo run -- -D DEBUG=1 -I /usr/include test.c
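
As a hypothetical input for the commands above, test.c could look like the following sketch. Only the file name and the DEBUG macro come from the examples; SQUARE and the function names are invented for illustration. The file deliberately has no main, so it is best suited to the -E (preprocess only) invocation.

```c
/* test.c: hypothetical input file for the example commands. */

#ifndef DEBUG
#define DEBUG 0 /* default when -D DEBUG=<value> is not passed */
#endif

/* Function-like macro, expanded by the preprocessor. */
#define SQUARE(x) ((x) * (x))

/* Reflects the value given on the command line via -D DEBUG=1. */
int debug_enabled(void) { return DEBUG != 0; }

/* After preprocessing, the body becomes `return ((7) * (7));`. */
int square_of_seven(void) { return SQUARE(7); }

/* Built-in macros are replaced with the current file name and line number. */
const char *source_file(void) { return __FILE__; }
int source_line(void) { return __LINE__; }
```

Running the preprocessor with and without -D DEBUG=1 and comparing the body of debug_enabled is a quick way to confirm that command-line definitions take effect.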

Design Documents

Comprehensive design documentation is available in the design-document/ directory.

Contributing

This is a learning project, but contributions are welcome! Areas of interest include:

  • Additional C23 language features
  • Performance optimizations
  • Testing and bug fixes
  • Documentation improvements

AI-Friendly Contributions

This project is AI-friendly and welcomes contributions from developers using AI tools. We encourage the use of AI for code generation, debugging, and documentation to enhance productivity.

License

See LICENSE file for details.
