Skip to content

Conversation

@LunaStev
Copy link
Member

@LunaStev LunaStev commented Mar 29, 2025

This PR transforms the current project into a library-based architecture and introduces core functionality for building and compiling WIR (Wave Intermediate Representation) into ELF binaries.

Key Changes

  • Refactored the project structure to be a library project
  • Added wir module: includes builder logic, type system, and instruction representation
  • Introduced and applied the Value type throughout wir
  • Implemented logic to insert string literals into the .data section and print them via syscall
  • Added basic .text section setup and program termination logic for ELF
  • Added NASM compilation and binary conversion handling
  • Included test-related code such as dummy, instruction, etc.

Summary by CodeRabbit

  • New Features

    • Introduced a new startup behavior displaying a greeting message.
    • Enabled code compilation from an abstract syntax tree into an executable binary, incorporating enhanced control flow functionality.
    • Provided streamlined assembly generation and ELF binary output.
  • Refactor

    • Standardized the package naming for consistency.
    • Streamlined code generation and module structure for improved maintainability.

@LunaStev LunaStev self-assigned this Mar 29, 2025
@coderabbitai
Copy link

coderabbitai bot commented Mar 29, 2025

Walkthrough

This pull request updates multiple modules to adopt a more functional style. The package name in Cargo.toml is updated for casing consistency, and a new entry point is introduced in bin/main.rs. Module visibility changes are applied in src/codegen/mod.rs and src/elf/mod.rs. The code generation logic in src/codegen/x86_64.rs and ELF writing in src/elf/writer.rs have been refactored to use standalone functions. Additionally, the project now introduces new AST and WIR components in the src/dummy.rs and src/wir/ modules while removing the previous object-oriented designs.

Changes

File(s) Summary of Changes
Cargo.toml Updated package name casing from "Whale" to "whale".
bin/main.rs Added a new main function that prints "Hello, world!" as the initial program entry point.
src/codegen/mod.rs,
src/codegen/x86_64.rs
In mod.rs, changed mod x86_64; to pub(crate) mod x86_64;. In x86_64.rs, removed the X86_64Codegen struct and its constructor, replaced with pub fn generate(module: &WIRModule) -> String and added a helper fn generate_print(format: &str, _args: &Vec<crate::wir::value::Value>) -> String.
src/dummy.rs Introduced a new file defining public entities: enums Expr and Statement, and the struct AST for representing an abstract syntax tree.
src/elf/mod.rs,
src/elf/writer.rs
In mod.rs, changed mod writer; to pub(crate) mod writer;. In writer.rs, removed the ELFWriter struct and its methods; added the standalone function pub fn write_elf(asm: &str, output_name: &str) that writes assembly to an ELF binary.
src/lib.rs Added pub fn compile_from_ast(ast: AST) which builds a WIR module from an AST, generates assembly code, and writes an ELF binary named "output".
src/main.rs Deleted the old main file which contained the previous entry point and module declarations.
src/wir/builder.rs,
src/wir/instruction.rs,
src/wir/mod.rs,
src/wir/value.rs
In builder.rs, removed the WIRBuilder struct and added pub fn build(ast: AST) -> WIRModule. In instruction.rs, added new enum variants for Print, Store, If, While, For, and Return. In mod.rs, introduced public modules (instruction, value, builder) and defined the WIRModule struct with methods new, push, and get. In value.rs, replaced the previous struct with enums WIRType and Value.

Sequence Diagram(s)

sequenceDiagram
    participant U as User
    participant C as compile_from_ast (lib.rs)
    participant B as WIR Builder (build)
    participant G as X86_64 Codegen (generate)
    participant E as ELF Writer (write_elf)

    U->>C: Provide AST
    C->>B: build(ast)
    B-->>C: WIRModule
    C->>G: generate(WIRModule)
    G-->>C: Assembly code (String)
    C->>E: write_elf(assembly, "output")
    E-->>C: ELF binary generated
Loading

Poem

I'm a hopping rabbit, full of cheer,
New functions bloom, the code is clear,
ASTs and modules dance in tune,
Assembly lines sing a magical rune,
From ELF to binary, bytes take flight,
CodeRabbit delights in changes bright!
🐰✨

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (16)
bin/main.rs (1)

1-3: Consider adding a more comprehensive example that demonstrates library functionality

While this simple "Hello, world!" example works as a basic entry point, it doesn't showcase any of the new functionality mentioned in the PR objectives (WIR, ELF output, etc.). Consider enhancing this main function to demonstrate how to use the new library features you've implemented.

 fn main() {
-    println!("Hello, world!");
+    // Example: Using the WIR builder and compiling to ELF
+    let ast = whale::dummy::create_simple_ast();
+    let output_path = "output.elf";
+    
+    match whale::compile_from_ast(&ast, output_path) {
+        Ok(_) => println!("Successfully compiled to {}", output_path),
+        Err(e) => eprintln!("Compilation failed: {}", e),
+    }
 }
src/lib.rs (2)

9-13: Consider parameterizing the output file path

The compile_from_ast function currently hardcodes the output filename as "output". To improve flexibility, consider making this a parameter.

-pub fn compile_from_ast(ast: AST) {
+pub fn compile_from_ast(ast: AST, output_path: &str) {
    let wir = wir::builder::build(ast);
    let asm = codegen::x86_64::generate(&wir);
-    elf::writer::write_elf(&asm, "output");
+    elf::writer::write_elf(&asm, output_path);
}

1-8: Add module documentation

Adding documentation for the modules and public function would improve code readability and help users understand the purpose of each component.

+//! Library for compiling AST to machine code via WIR (Wave Intermediate Representation)
+//! and generating ELF executables.
+
 use crate::dummy::AST;

+/// Module containing the Wave Intermediate Representation components
 mod wir;
+/// Module for code generation from WIR to assembly
 mod codegen;
+/// Module for writing assembly to ELF format
 mod elf;
+/// Utility functions
 mod utils;
+/// Module containing AST data structures (temporary implementation)
 mod dummy;
src/wir/value.rs (3)

1-12: Add documentation for WIRType enum

Adding documentation would help clarify the purpose and usage of this type enumeration.

+/// Represents the type system for the Wave Intermediate Representation (WIR).
+/// These types are used to define the structure of data within the WIR.
 #[derive(Debug, Clone)]
 pub enum WIRType {
+    /// 32-bit signed integer
     I32,
+    /// 32-bit unsigned integer
     U32,
+    /// 32-bit floating point
     F32,
+    /// String type
     Str,
+    /// Boolean type
     Bool,
+    /// Character type
     Char,
+    /// Byte type (8-bit unsigned integer)
     Byte,
+    /// Pointer to another type
     Ptr(Box<WIRType>),
+    /// Array of elements of the same type
     Array(Box<WIRType>),
 }

14-26: Add documentation and methods for Value enum

Adding documentation and implementing methods to work with values would improve usability.

+/// Represents concrete values in the Wave Intermediate Representation (WIR).
+/// These values correspond to the types defined in `WIRType`.
 #[derive(Debug, Clone)]
 pub enum Value {
     Integer(i32),
     Unsigned(u32),
     Float(f32),
     String(String),
     Boolean(bool),
     Char(char),
     Byte(u8),
     Identifier(String),
     Pointer(String),
     Array(Vec<Value>),
 }
+
+impl Value {
+    /// Returns the WIRType of this value
+    pub fn get_type(&self) -> WIRType {
+        match self {
+            Value::Integer(_) => WIRType::I32,
+            Value::Unsigned(_) => WIRType::U32,
+            Value::Float(_) => WIRType::F32,
+            Value::String(_) => WIRType::Str,
+            Value::Boolean(_) => WIRType::Bool,
+            Value::Char(_) => WIRType::Char,
+            Value::Byte(_) => WIRType::Byte,
+            Value::Identifier(_) => WIRType::Str, // Simplified, might need a separate type
+            Value::Pointer(_) => WIRType::Ptr(Box::new(WIRType::U32)), // Default to pointer to u32
+            Value::Array(elements) => {
+                if let Some(first) = elements.first() {
+                    WIRType::Array(Box::new(first.get_type()))
+                } else {
+                    // Empty array, default to Array of I32
+                    WIRType::Array(Box::new(WIRType::I32))
+                }
+            }
+        }
+    }
+}

14-26: Consider implementing PartialEq and Eq for comparison operations

To enable value comparison, consider implementing PartialEq and Eq traits.

-#[derive(Debug, Clone)]
+#[derive(Debug, Clone, PartialEq, Eq)]
 pub enum Value {
     Integer(i32),
     Unsigned(u32),
     Float(f32),
     String(String),
     Boolean(bool),
     Char(char),
     Byte(u8),
     Identifier(String),
     Pointer(String),
     Array(Vec<Value>),
 }

Note: Implementing Eq might require special handling for Float values which cannot be directly compared using Eq due to NaN values. You may need custom implementations for these traits.

src/dummy.rs (3)

7-7: Translate comment to English for consistency

The comment is in Korean. For international collaboration, it would be better to use English comments throughout the codebase.

-    // 나중에: BinaryOp, FunctionCall 등 추가
+    // TODO: Add BinaryOp, FunctionCall, etc. in the future

1-19: Add documentation and consider renaming the module

The name "dummy" suggests this is temporary code. If this is intended for production use, consider renaming the module to something more descriptive (e.g., "ast") and adding appropriate documentation.

+/// Abstract Syntax Tree (AST) structures for representing program code
+/// before conversion to WIR.
+/// 
+/// Note: This implementation is currently simplified and will be expanded
+/// with more expression and statement types in the future.
 #[derive(Debug)]
 pub enum Expr {
     Integer(i32),
     String(String),
     Bool(bool),
     Identifier(String),
     // TODO: Add BinaryOp, FunctionCall, etc. in the future
 }

+/// Represents a statement in the AST
 #[derive(Debug)]
 pub enum Statement {
+    /// Print statement (e.g., print "Hello")
     Print(String),
+    /// Assignment statement (e.g., x = 42)
     Assign(String, Expr),
 }

+/// The root structure of the Abstract Syntax Tree
 #[derive(Debug)]
 pub struct AST {
+    /// List of statements in the program
     pub statements: Vec<Statement>,
 }

17-19: Add constructor method for AST

Consider adding a constructor method for better ergonomics when creating new AST instances.

 #[derive(Debug)]
 pub struct AST {
     pub statements: Vec<Statement>,
 }
+
+impl AST {
+    /// Creates a new empty AST
+    pub fn new() -> Self {
+        Self { statements: vec![] }
+    }
+    
+    /// Creates an AST with the given statements
+    pub fn with_statements(statements: Vec<Statement>) -> Self {
+        Self { statements }
+    }
+    
+    /// Adds a statement to the AST
+    pub fn add_statement(&mut self, statement: Statement) {
+        self.statements.push(statement);
+    }
+}
src/wir/builder.rs (3)

12-15: Implement format string support for Print instruction

The current Print instruction doesn't support format specifiers. Consider implementing basic format string support.

                 let instr = Instruction::Print {
                     format: text,
-                    args: vec![], // {} format factor can be added later
+                    args: vec![], // Format arguments for {} placeholders
                 };

This would require additional changes to parse format strings and match arguments, which could be implemented in a follow-up PR.


11-11: Consider translating comments to English

For consistency with the rest of the codebase, consider using English for all comments.


20-20: Consider translating comments to English

For consistency with the rest of the codebase, consider using English for all comments.

src/wir/mod.rs (2)

7-9: Consider restricting direct field access.

Exposing instructions as a public field can allow external modifications without control. Hiding it behind methods can improve encapsulation.

 pub struct WIRModule {
-    pub instructions: Vec<Instruction>,
+    instructions: Vec<Instruction>,
 }

11-25: Return slices instead of &Vec<Instruction>.

Returning &[Instruction] is more idiomatic and prevents accidental reliance on Vec-specific APIs, improving future flexibility.

 pub fn get(&self) -> &Vec<Instruction> {
-    &self.instructions
+    &self.instructions
 }

+pub fn instructions(&self) -> &[Instruction] {
+    &self.instructions
+}
src/wir/instruction.rs (2)

21-26: Fix indentation in the For loop variant.

There's inconsistent indentation in the For variant - line 22 has more indentation than the following lines.

 For {
-      init: Box<Instruction>,
+        init: Box<Instruction>,
         condition: Value,
         step: Box<Instruction>,
         body: Vec<Instruction>,
     },

3-32: Consider adding documentation and missing instruction types.

While the current implementation covers basic control flow and operations, consider:

  1. Adding rustdoc comments to document each variant's purpose and usage
  2. Adding missing instruction types that might be needed:
    • Comparison operations (Equal, Less, Greater, etc.)
    • Logical operations (AND, OR, NOT)
    • Bitwise operations
    • Memory access operations beyond Store

This would make the instruction set more complete for the WIR implementation.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a548e57 and 0c3967e.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • Cargo.toml (1 hunks)
  • bin/main.rs (1 hunks)
  • src/codegen/mod.rs (1 hunks)
  • src/codegen/x86_64.rs (1 hunks)
  • src/dummy.rs (1 hunks)
  • src/elf/mod.rs (1 hunks)
  • src/elf/writer.rs (1 hunks)
  • src/lib.rs (1 hunks)
  • src/main.rs (0 hunks)
  • src/wir/builder.rs (1 hunks)
  • src/wir/instruction.rs (1 hunks)
  • src/wir/mod.rs (1 hunks)
  • src/wir/value.rs (1 hunks)
💤 Files with no reviewable changes (1)
  • src/main.rs
🔇 Additional comments (11)
Cargo.toml (1)

2-2: Package name now follows Rust naming conventions

Good update. Changing the package name from "Whale" to "whale" follows the Rust convention of using lowercase names for packages.

src/elf/mod.rs (1)

1-1: Appropriate module visibility for library architecture

Making the writer module pub(crate) is a good choice - it allows the module to be used by other parts of the crate while still keeping it as an internal implementation detail that's not exposed to users of the library.

src/codegen/mod.rs (1)

1-1: Appropriate module visibility for library architecture

Making the x86_64 module pub(crate) is a good choice - it allows the module to be used by other parts of the crate (like the new compile_from_ast function) while still keeping it as an internal implementation detail that's not exposed to users of the library.

src/elf/writer.rs (1)

30-31: Good to have: Provide informative output post-linking.

Printing a success message is helpful for debugging. You could also print the full path to the generated file or additional metadata (e.g., file size).

src/wir/mod.rs (1)

1-3: Visibility of submodules is appropriate for modular design.

Declaring pub mod instruction;, pub mod value;, and pub mod builder; makes it clear what is exposed at the WIR level.

src/wir/instruction.rs (6)

1-2: Appropriate import for Value type.

The import of the Value type is correctly added to support the new instruction variants. This aligns with the project's refactoring to a more functional style and the implementation of the WIR module.


4-7: Print instruction implementation looks good.

The Print variant with format string and arguments vector provides a solid foundation for string output capabilities. This matches the PR objective of implementing string literal handling for ELF binaries.


8-11: Store instruction is properly implemented.

The Store instruction provides a clean way to associate values with named variables in the intermediate representation.


12-16: If statement implementation matches control flow needs.

The If variant is well-structured with condition, then block, and optional else block components, providing complete conditional logic representation.


17-20: While loop implementation is straightforward and effective.

The While variant appropriately captures the two essential components of a while loop: the condition and the body.


27-27: Return instruction is appropriately implemented.

The Return variant is correctly defined to take a Value parameter, allowing functions to return values as needed.

Comment on lines +9 to +13
pub fn compile_from_ast(ast: AST) {
let wir = wir::builder::build(ast);
let asm = codegen::x86_64::generate(&wir);
elf::writer::write_elf(&asm, "output");
} No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling

The current implementation doesn't handle potential errors during compilation. Consider returning a Result type to enable proper error propagation.

-pub fn compile_from_ast(ast: AST) {
+pub fn compile_from_ast(ast: AST, output_path: &str) -> Result<(), std::io::Error> {
    let wir = wir::builder::build(ast);
    let asm = codegen::x86_64::generate(&wir);
-    elf::writer::write_elf(&asm, "output");
+    elf::writer::write_elf(&asm, output_path)?;
+    Ok(())
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub fn compile_from_ast(ast: AST) {
let wir = wir::builder::build(ast);
let asm = codegen::x86_64::generate(&wir);
elf::writer::write_elf(&asm, "output");
}
pub fn compile_from_ast(ast: AST, output_path: &str) -> Result<(), std::io::Error> {
let wir = wir::builder::build(ast);
let asm = codegen::x86_64::generate(&wir);
elf::writer::write_elf(&asm, output_path)?;
Ok(())
}

Comment on lines 5 to 39
pub fn build(ast: AST) -> WIRModule {
let mut module = WIRModule::new();

for stmt in ast.statements {
match stmt {
Statement::Print(text) => {
// println("Hello World") → Print { format: "...", args: [] }
let instr = Instruction::Print {
format: text,
args: vec![], // {} format factor can be added later
};
module.push(instr);
}

Statement::Assign(name, val) => {
// Ex: var a = 10; → Store { name: a, value: ... }
let value = match val {
dummy::Expr::Integer(i) => Value::Integer(i),
dummy::Expr::String(s) => Value::String(s),
dummy::Expr::Bool(b) => Value::Boolean(b),
dummy::Expr::Identifier(id) => Value::Identifier(id),
};

module.push(Instruction::Store {
name,
value,
});
}

// If, When, For, Return, etc. can be added later
}
}
}

module
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling for the build function

The current implementation doesn't handle potential errors during the build process. Consider adding error handling to make the code more robust.

+use std::fmt;
 use crate::dummy;
 use crate::dummy::{AST, Statement};
 use super::{instruction::Instruction, value::Value, WIRModule};

+/// Error type for the build process
+#[derive(Debug)]
+pub enum BuildError {
+    UnsupportedExpr(String),
+    UnsupportedStatement(String),
+}
+
+impl fmt::Display for BuildError {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        match self {
+            BuildError::UnsupportedExpr(msg) => write!(f, "Unsupported expression: {}", msg),
+            BuildError::UnsupportedStatement(msg) => write!(f, "Unsupported statement: {}", msg),
+        }
+    }
+}
+
+impl std::error::Error for BuildError {}

-pub fn build(ast: AST) -> WIRModule {
+/// Builds a WIRModule from an AST
+/// 
+/// # Arguments
+/// 
+/// * `ast` - The AST to build from
+/// 
+/// # Returns
+/// 
+/// A Result containing the built WIRModule or a BuildError
+pub fn build(ast: AST) -> Result<WIRModule, BuildError> {
     let mut module = WIRModule::new();

     for stmt in ast.statements {
         match stmt {
             Statement::Print(text) => {
                 // println("Hello World") → Print { format: "...", args: [] }
                 let instr = Instruction::Print {
                     format: text,
                     args: vec![], // {} format factor can be added later
                 };
                 module.push(instr);
             }

             Statement::Assign(name, val) => {
                 // Ex: var a = 10; → Store { name: a, value: ... }
                 let value = match val {
                     dummy::Expr::Integer(i) => Value::Integer(i),
                     dummy::Expr::String(s) => Value::String(s),
                     dummy::Expr::Bool(b) => Value::Boolean(b),
                     dummy::Expr::Identifier(id) => Value::Identifier(id),
+                    _ => return Err(BuildError::UnsupportedExpr(format!("{:?}", val))),
                 };

                 module.push(Instruction::Store {
                     name,
                     value,
                 });
             }

+            _ => return Err(BuildError::UnsupportedStatement(format!("{:?}", stmt))),
             // If, When, For, Return, etc. can be added later
         }
    }

-    module
+    Ok(module)
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub fn build(ast: AST) -> WIRModule {
let mut module = WIRModule::new();
for stmt in ast.statements {
match stmt {
Statement::Print(text) => {
// println("Hello World") → Print { format: "...", args: [] }
let instr = Instruction::Print {
format: text,
args: vec![], // {} format factor can be added later
};
module.push(instr);
}
Statement::Assign(name, val) => {
// Ex: var a = 10; → Store { name: a, value: ... }
let value = match val {
dummy::Expr::Integer(i) => Value::Integer(i),
dummy::Expr::String(s) => Value::String(s),
dummy::Expr::Bool(b) => Value::Boolean(b),
dummy::Expr::Identifier(id) => Value::Identifier(id),
};
module.push(Instruction::Store {
name,
value,
});
}
// If, When, For, Return, etc. can be added later
}
}
}
\ No newline at end of file
module
}
use std::fmt;
use crate::dummy;
use crate::dummy::{AST, Statement};
use super::{instruction::Instruction, value::Value, WIRModule};
/// Error type for the build process
#[derive(Debug)]
pub enum BuildError {
UnsupportedExpr(String),
UnsupportedStatement(String),
}
impl fmt::Display for BuildError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
BuildError::UnsupportedExpr(msg) => write!(f, "Unsupported expression: {}", msg),
BuildError::UnsupportedStatement(msg) => write!(f, "Unsupported statement: {}", msg),
}
}
}
impl std::error::Error for BuildError {}
/// Builds a WIRModule from an AST
///
/// # Arguments
///
/// * `ast` - The AST to build from
///
/// # Returns
///
/// A Result containing the built WIRModule or a BuildError
pub fn build(ast: AST) -> Result<WIRModule, BuildError> {
let mut module = WIRModule::new();
for stmt in ast.statements {
match stmt {
Statement::Print(text) => {
// println("Hello World") → Print { format: "...", args: [] }
let instr = Instruction::Print {
format: text,
args: vec![], // {} format factor can be added later
};
module.push(instr);
}
Statement::Assign(name, val) => {
// Ex: var a = 10; → Store { name: a, value: ... }
let value = match val {
dummy::Expr::Integer(i) => Value::Integer(i),
dummy::Expr::String(s) => Value::String(s),
dummy::Expr::Bool(b) => Value::Boolean(b),
dummy::Expr::Identifier(id) => Value::Identifier(id),
_ => return Err(BuildError::UnsupportedExpr(format!("{:?}", val))),
};
module.push(Instruction::Store {
name,
value,
});
}
_ => return Err(BuildError::UnsupportedStatement(format!("{:?}", stmt))),
// If, When, For, Return, etc. can be added later
}
}
Ok(module)
}

Comment on lines 9 to 29
// Step 3: Create ELF executable (ld)
let ld_status = Command::new("ld")
.args(["-o", output_name, "temp.o"])
.status()
.expect("Failed to run ld");

if !ld_status.success() {
panic!("ld failed");
}
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Return a Result instead of panicking on linking failures.

Similar to the nasm step, consider forwarding an error to the caller to maintain a non-panicking workflow in library code.

 let ld_status = Command::new("ld")
     .args(["-o", output_name, "temp.o"])
     .status()
     .expect("Failed to run ld");

 if !ld_status.success() {
-    panic!("ld failed");
+    return Err("ld failed".into());
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Step 3: Create ELF executable (ld)
let ld_status = Command::new("ld")
.args(["-o", output_name, "temp.o"])
.status()
.expect("Failed to run ld");
if !ld_status.success() {
panic!("ld failed");
}
}
\ No newline at end of file
// Step 3: Create ELF executable (ld)
let ld_status = Command::new("ld")
.args(["-o", output_name, "temp.o"])
.status()
.expect("Failed to run ld");
if !ld_status.success() {
return Err("ld failed".into());
}

Comment on lines +10 to 19
// Step 2: Create .o file (nasm)
let nasm_status = Command::new("nasm")
.args(["-felf64", "temp.asm", "-o", "temp.o"])
.status()
.expect("Failed to run nasm");

if !nasm_status.success() {
panic!("nasm failed");
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid panicking when nasm fails.

Since this is a library function, returning a Result instead of panicking would provide more graceful error handling and allow callers to manage failures appropriately.

 let nasm_status = Command::new("nasm")
     .args(["-felf64", &temp_asm_path, "-o", "temp.o"])
     .status()
     .expect("Failed to run nasm");

 if !nasm_status.success() {
-    panic!("nasm failed");
+    return Err("nasm failed".into());
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Step 2: Create .o file (nasm)
let nasm_status = Command::new("nasm")
.args(["-felf64", "temp.asm", "-o", "temp.o"])
.status()
.expect("Failed to run nasm");
if !nasm_status.success() {
panic!("nasm failed");
}
// Step 2: Create .o file (nasm)
let nasm_status = Command::new("nasm")
.args(["-felf64", &temp_asm_path, "-o", "temp.o"])
.status()
.expect("Failed to run nasm");
if !nasm_status.success() {
return Err("nasm failed".into());
}

Comment on lines +5 to +9
pub fn write_elf(asm: &str, output_name: &str) {
// Step 1: Save .asm files
let mut asm_file = File::create("temp.asm").expect("Failed to create temp.asm");
asm_file.write_all(asm.as_bytes()).expect("Failed to write asm");

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider using unique temporary filenames for concurrency safety.

In a library context, multiple processes or threads might concurrently run write_elf and attempt to write to the same "temp.asm" path. You could leverage a dedicated temporary directory or random file naming (e.g., using the tempfile crate in Rust) to avoid collisions.

 pub fn write_elf(asm: &str, output_name: &str) {
-    let mut asm_file = File::create("temp.asm").expect("Failed to create temp.asm");
+    let temp_asm_path = format!("temp_{}.asm", uuid::Uuid::new_v4());
+    let mut asm_file = File::create(&temp_asm_path).expect("Failed to create temporary file");
     asm_file.write_all(asm.as_bytes()).expect("Failed to write asm");
     ...
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub fn write_elf(asm: &str, output_name: &str) {
// Step 1: Save .asm files
let mut asm_file = File::create("temp.asm").expect("Failed to create temp.asm");
asm_file.write_all(asm.as_bytes()).expect("Failed to write asm");
pub fn write_elf(asm: &str, output_name: &str) {
// Step 1: Save .asm files
let temp_asm_path = format!("temp_{}.asm", uuid::Uuid::new_v4());
let mut asm_file = File::create(&temp_asm_path).expect("Failed to create temporary file");
asm_file.write_all(asm.as_bytes()).expect("Failed to write asm");
...
}

Comment on lines 6 to 32
pub fn generate(module: &WIRModule) -> String {
let mut asm = String::new();

// Default ELF Text Section
asm.push_str("section .text\n");
asm.push_str("global _start\n\n");
asm.push_str("_start:\n");

pub fn generate(&self) {
println!("Generating x86_64 code...");
for instr in module.get() {
match instr {
Instruction::Print { format, args } => {
asm.push_str(&generate_print(format, args));
}
Instruction::Store { name, value } => {
// For testing: Store first annotate
asm.push_str(&format!(" ; Store {} = {:?}\n", name, value));
}
_ => {
asm.push_str(" ; [UNIMPLEMENTED INSTRUCTION]\n");
}
}
}
}

// Ending the program
asm.push_str(" mov rax, 60\n"); // syscall: exit
asm.push_str(" xor rdi, rdi\n"); // status 0
asm.push_str(" syscall\n");

asm
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ensure .data section is declared when referencing string labels.

The generated code references msg and msg_len, but they must be declared in the .data section. Without these definitions, the assembled output may fail to link or run properly.

Comment on lines +34 to +45
fn generate_print(format: &str, _args: &Vec<crate::wir::value::Value>) -> String {
let mut code = String::new();

// Simply insert a string literal into the .data section and output it as syscall
code.push_str(" mov rax, 1\n"); // syscall: write
code.push_str(" mov rdi, 1\n"); // fd: stdout
code.push_str(" mov rsi, msg\n"); // msg address
code.push_str(" mov rdx, msg_len\n"); // msg length
code.push_str(" syscall\n\n");

code
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Use the format parameter.

Currently, format is not utilized, and the code always prints the same message. Consider supporting dynamic strings or multiple placeholders to leverage this parameter fully.

 fn generate_print(format: &str, _args: &Vec<crate::wir::value::Value>) -> String {
     let mut code = String::new();
     // e.g., copy 'format' content into a unique label in some .data section
-    code.push_str("  mov rax, 1\n");
-    code.push_str("  mov rdi, 1\n");
-    code.push_str("  mov rsi, msg\n");
-    code.push_str("  mov rdx, msg_len\n");
-    code.push_str("  syscall\n\n");
+    code.push_str(&format!(
+        "  mov rax, 1\n  mov rdi, 1\n  mov rsi, {label}\n  mov rdx, {label}_len\n  syscall\n\n",
+        label = "format_label"
+    ));
     code
 }

Committable suggestion skipped: line range outside the PR's diff.

@LunaStev LunaStev merged commit e969034 into master Mar 29, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants