CHICO-Agent: An LLM Agent for the Cross-layer Optimization of 2.5D and 3D Chiplet-based Systems

CHICO-Agent is an LLM-driven optimization framework that leverages heterogeneous integration (HI) co-design across the full design hierarchy—application, architecture, chiplet, and packaging. It replaces the stochastic perturbation mechanisms of traditional metaheuristics (e.g., simulated annealing) with a reasoning-driven optimization loop for cross-layer design space exploration (DSE). CHICO-Agent maintains a persistent knowledge base to capture parameter–outcome trends and coordinates exploration through a hierarchical admin–field multi-agent workflow, enabling systematic evaluation of trade-offs in next-generation AI hardware.

File structure

  • .codex - Repository-based Codex configuration directory
  • chiplet - Core chiplet modeling modules
  • config - Configuration files (parameters, calibration, architecture examples, blacklists)
  • config.py - Global configuration
  • documentation - PPAC modeling reference documentation
  • main.py - Main entry point
  • Makefile - Build and run targets
  • script - Utility scripts for launching runs and calibration
  • system - DSE runners and simulation utilities

Getting started

Prerequisites

CHICO-Agent requires the following:

  • Python >= 3.9
  • pip >= 25.0
  • Codex CLI (OpenAI)

Additionally, refer to the requirements.txt file in this repository; its packages are installed into a virtual environment during setup.

Download and install with bash

git clone https://github.com/ASU-VDA-Lab/CHICO-Agent.git
cd CHICO-Agent
python3 -m venv path-ai
source path-ai/bin/activate
pip3 install -r requirements.txt

Codex CLI setup

CHICO-Agent uses OpenAI Codex CLI as the LLM agent backbone. The default model is gpt-5.3-codex.

1. Install Codex CLI

Follow the official Codex CLI installation guide to install the CLI tool. On most systems:

npm install -g @openai/codex

2. Set your OpenAI API key

Codex CLI requires an OpenAI API key. Export it in your shell:

export OPENAI_API_KEY="sk-..."

You can add this to your shell profile (e.g., ~/.bashrc, ~/.zshrc) for persistence.

Alternatively, you can log in with your OpenAI account using the /login command.

3. Codex configuration

CHICO-Agent ships with a pre-configured .codex/config.toml that sets the model and sandbox permissions:

model = "gpt-5.3-codex"
model_reasoning_effort = "xhigh"
model_supports_reasoning_summaries = true
model_reasoning_summary = "detailed"

sandbox_mode = "workspace-write"
web_search = "disabled"

[sandbox_workspace_write]
writable_roots = ["/your/output/directory"]

Important: Update the writable_roots path in .codex/config.toml to point to your project's output directory before running.

The key configuration options are:

Option                    Description
model                     The OpenAI model to use. Default: gpt-5.3-codex
model_reasoning_effort    Controls the model's reasoning depth: low, medium, high, xhigh
model_reasoning_summary   Reasoning summary verbosity: detailed
sandbox_mode              Sandbox permission mode. Must be workspace-write for CHICO-Agent
web_search                Web search capability. Set to disabled for reproducibility
writable_roots            Directories the agent is allowed to write to

Input parameters and configuration

Parameters

CHICO-Agent utilizes parameters and configurations stored in the config directory. The core parameters for the framework are located in the config/parameters folder. Below is a list of the various JSON files along with their details. The primary file that users need to modify is input.json, as it defines the overall search space for CHICO-Agent. In addition to this, several other parameter files are available, as listed below:

config/parameters/
├── base_spec_7nm.json       [Area and power for different systolic arrays at 7nm]
├── base_spec_sram_7nm.json  [Area and energy for different SRAM sizes at 7nm]
├── bonding_yield.json       [Bonding yield for different package types]
├── cost_profiles.json       [Weight factors for different optimization profiles]
├── d2d_input.json           [Die-to-die datarate, efficiency, pitch information]
├── energy_eff.json          [Die-to-die protocol and DRAM energy efficiency]
├── freq_scale.json          [Frequency scale across tech nodes]
├── input.json               [Main parameter file that defines the search space]
├── mem_bw.json              [Memory bandwidth for different systolic arrays]
├── scaling.json             [Area and power scaling for logic]
├── sram_energy.json         [SRAM energy values and SRAM energy scaling]
└── sram_scaling.json        [SRAM area and energy scaling]

Users can input different parameters of their choice by modifying the above parameter files. CHICO-Agent models the overall HI-system's area, power, energy, cost, and cycle-accurate latency using data from the above JSON files.

CHICO-Agent supports multiple optimization profiles: balance, mobile, wearables, automotive, and cost-optimized. Users can additionally customize the weight of various metrics according to their preferences.
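As a rough illustration of how a profile's weights combine the normalized metrics, consider the sketch below. The metric names, weight values, and linear form are assumptions for illustration only, not the repository's actual formula:

```python
# Hypothetical sketch: combine normalized PPAC metrics into one score
# using a profile's weights. The weight names and the weighted-sum form
# are assumptions, not taken from the CHICO-Agent source.
def weighted_cost(norm_metrics, weights):
    """Return the weighted sum of normalized metrics (lower is better)."""
    return sum(weights[k] * norm_metrics[k] for k in weights)

# A "mobile"-style profile might emphasize energy and area.
profile = {"energy": 0.4, "area": 0.3, "latency": 0.2, "cost": 0.1}
metrics = {"energy": 0.8, "area": 0.5, "latency": 1.2, "cost": 0.9}
score = weighted_cost(metrics, profile)
```

Changing the weights in a profile shifts which trade-offs the agent favors during exploration.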

Calibration

Since the metrics used by CHICO-Agent have different units and scales, normalization is necessary to prevent any single term from dominating the cost function. The files below contain the calibrated data for six different workloads. Calibration was performed on 10,000 samples, and both the median and minimum values were used.

config/calibration/
├── calibration_1.json
├── calibration_2.json
├── calibration_3.json
├── calibration_4.json
├── calibration_5.json
└── calibration_6.json
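The exact normalization formula is not reproduced here, but a min/median-based scaling consistent with the description above can be sketched as follows (the formula itself is an assumption):

```python
import statistics

# Sketch of min/median-based normalization: the sample minimum maps to 0
# and the sample median maps to 1, so metrics with different units end up
# on comparable scales. The exact formula CHICO-Agent uses may differ.
def normalize(value, samples):
    lo = min(samples)
    med = statistics.median(samples)
    return (value - lo) / (med - lo)

samples = [2.0, 4.0, 6.0, 8.0, 10.0]  # stand-in for 10,000 calibration samples
mid_point = normalize(6.0, samples)   # median maps to 1.0
floor = normalize(2.0, samples)       # minimum maps to 0.0
```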

The command to run calibration on a particular workload is shown below; by default, it runs for 10,000 samples.

make calibration WORKLOAD=5

NOTE: Delete the old config/calibration_*.json files before running a new calibration.

Simulation cache

CHICO-Agent computes cycle-accurate latency for the AI workloads it runs, which can be time-intensive. To address this, we implemented a lookup table–based simulation cache that dynamically stores key parameters such as systolic array size, workload shape, memory bandwidth, SRAM size, data flow, and the computed cycle count. During the optimization, the simulator is invoked only if a cache miss occurs (i.e., a configuration has not been encountered before). This approach significantly speeds up the computation. Additionally, the simulation cache is configured to automatically update on a miss, enabling faster execution for subsequent runs.
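The cache described above can be sketched as a dictionary keyed on the configuration fields; run_simulator below is a stand-in for the real cycle-accurate simulator, which is only invoked on a miss:

```python
# Minimal sketch of a lookup-table simulation cache. The key mirrors the
# fields listed above (systolic array size, workload shape, memory
# bandwidth, SRAM size, dataflow); the cache updates itself on a miss.
sim_cache = {}

def cached_cycles(array_size, workload_shape, mem_bw, sram_kb, dataflow,
                  run_simulator):
    key = (array_size, tuple(workload_shape), mem_bw, sram_kb, dataflow)
    if key not in sim_cache:                 # miss: simulate and store
        sim_cache[key] = run_simulator(*key)
    return sim_cache[key]                    # hit: reuse stored cycle count

calls = []
fake_sim = lambda *key: calls.append(key) or 12345   # records each invocation
a = cached_cycles(128, (128, 2048, 1000), 256, 512, "ws", fake_sim)
b = cached_cycles(128, (128, 2048, 1000), 256, 512, "ws", fake_sim)
```

The second call hits the cache, so the simulator runs exactly once for a repeated configuration.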

Architecture blacklist

Not all parameter combinations produce physically realizable systems. The file config/config_blacklist.json enumerates illegal combinations, including mismatched protocol assignments, unstable 3D stacks, and incorrect HI classifications. CHICO-Agent validates every proposed configuration against this blacklist before evaluation.
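The validation step can be sketched as below. The real config/config_blacklist.json schema is not reproduced here; this assumes each rule is a set of parameter/value pairs that must not all appear together in one configuration:

```python
# Hypothetical sketch of blacklist validation: a configuration is rejected
# if it matches every key/value pair of any rule. The rule shown is
# illustrative, not taken from config/config_blacklist.json.
def violates_blacklist(config, blacklist):
    """Return True if config matches any illegal combination."""
    return any(all(config.get(k) == v for k, v in rule.items())
               for rule in blacklist)

blacklist = [{"package": "2.5D", "stack_layers": 4}]   # illustrative rule
bad  = {"package": "2.5D", "stack_layers": 4, "protocol": "UCIe"}
good = {"package": "2.5D", "stack_layers": 1, "protocol": "UCIe"}
bad_hit, good_hit = violates_blacklist(bad, blacklist), violates_blacklist(good, blacklist)
```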

Running CHICO-Agent

CHICO-Agent can be launched in several ways. Because it performs an extensive design space exploration over a vast search space, run times vary with the workload.

Include new GEMM workload

To run on a new GEMM workload, add an entry to workload.json and use the corresponding workload number in the commands below. Entries in workload.json use the M, K, N format shown here:

"1": [128, 2048, 1000],

Here, workload 1 has M=128, K=2048, and N=1000.
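An entry in this format can be read as follows (loading from a string here stands in for reading workload.json):

```python
import json

# Parse a workload entry in the M, K, N format described above.
workloads = json.loads('{"1": [128, 2048, 1000]}')
M, K, N = workloads["1"]
macs = M * K * N   # multiply-accumulates, one way to gauge the GEMM's size
```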

Evaluate a single architecture

make cost ARCH_FILE=output/wl1_mobile/test/arch_1.json \
          WORKLOAD=1 \
          COST_PROFILE=mobile \
          RESULT_CSV=output/wl1_mobile/result_wl1_mobile.csv

Evaluate a batch of architectures

make cost_batch ARCH_DIR=output/wl1_mobile/test/ \
                WORKLOAD=1 \
                COST_PROFILE=mobile \
                RESULT_CSV=output/wl1_mobile/result_wl1_mobile.csv

Run calibration

make calibration WORKLOAD=5

make cost arguments

Variable       Default                                        Description
ARCH_FILE      config/gen_arch/temp_example/3d_example.json   Architecture JSON to evaluate
WORKLOAD       1                                              Workload index (1–6)
COST_PROFILE   mobile                                         Profile from config/parameters/cost_profiles.json
RESULT_CSV     unset                                          Optional append target path for result CSV
RUN_NAME       generated if omitted                           Optional simulation/cache run label

Run calibration in parallel for all workloads

To delete old calibration files for all workloads:

python script/clean_calibration.py

To launch calibration in parallel for all workloads, update the RUN_NAME and WORKLOADS in run_parallel_calibration.sh and launch:

./script/run_parallel_calibration.sh

Launch CHICO-Agent via the Codex CLI

CHICO-Agent uses the Codex CLI to drive the LLM agent loop. The launcher script run_codex.sh orchestrates the full optimization workflow—creating output directories, starting new Codex sessions, and resuming sessions across iterations.

Single workload run

To run CHICO-Agent on a single workload with a specific cost profile:

./script/run_codex.sh <workload> <cost_profile> <max_iterations>

For example, to optimize workload 1 with the mobile profile for 10 iterations:

./script/run_codex.sh 1 mobile 10

Parallel runs across workloads and profiles

To launch CHICO-Agent in parallel across all 6 workloads for a specific cost profile:

./script/run_codex.sh --parallel <max_iterations> <cost_profile>

For example, to run all workloads with the automotive profile for 5 iterations:

./script/run_codex.sh --parallel 5 automotive

To sweep all workloads across all cost profiles (mobile, balance, automotive, wearables):

./script/run_codex.sh --parallel <max_iterations>

Environment variables

The launcher script supports several environment variables for customization:

Variable                              Default             Description
CODEX_MODEL                           gpt-5.3-codex       OpenAI model to use
CODEX_MODEL_REASONING_EFFORTS_INDEX   1                   Index into reasoning effort levels (1=low, 2=medium, 3=high, 4=xhigh)
CODEX_COOLDOWN                        10                  Seconds to wait between iterations
OUTPUT_DIR                            output              Base output directory
CODEX_SESSIONS_DIR                    ~/.codex/sessions   Directory where Codex stores session files
STAGGER_BASE                          10                  Seconds between parallel job launches (to avoid API rate limits)

How the Codex agent loop works

Each iteration of the CHICO-Agent loop proceeds as follows:

  1. First iteration: The script runs codex exec with an initial prompt (e.g., "Find the minimum cost for workload 1 with cost profile mobile"). Codex spawns a new session and the agent begins exploring the design space.

  2. Subsequent iterations: The script builds a structured state-injection prompt from the current KNOWHOW.md, BEST.csv, and result CSV, then resumes the existing Codex session with codex exec <prompt> resume <session_id>. This provides the agent with full context of prior exploration.

  3. Session persistence: The Codex session ID is saved to SESSION_ID in the output directory, allowing the loop to resume across script restarts.

The launcher script creates run directories following the pattern:

output/wl{N}_{profile}_{I}itr_m{M}_re{RE}/

where {N} is the workload index, {profile} is the cost profile, {I} is the max iterations, {M} is the model index, and {RE} is the reasoning effort index.
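This naming can be reproduced with a simple format string (a one-line sketch, not the launcher's actual code):

```python
# Reconstructs the run-directory pattern described above.
def run_dir(n, profile, iters, model_idx, reasoning_idx):
    return f"output/wl{n}_{profile}_{iters}itr_m{model_idx}_re{reasoning_idx}/"

path = run_dir(1, "mobile", 5, 0, 1)
```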

Other make targets

Target             Description
make calibration   Builds normalization baselines for a workload/profile
make validate      Validates architecture JSONs against the blacklist

Outputs

Once CHICO-Agent completes, it creates a run folder under the output directory, named after the workload and profile.

The folder contains BEST.csv, KNOWHOW.md, and the result CSV files generated during the runs. Below is an example of the output for the wl1 mobile optimization-profile case:

output/wl1_mobile_5itr_m0_re1/
├── BEST.csv
├── KNOWHOW.md
├── result_wl1_mobile.csv
└── test/
    ├── arch_1.json
    ├── arch_2.json
    └── ...

BEST.csv

Tracks the global minimum cost across iterations. Each row records the iteration number, elapsed time, cumulative architectures evaluated, and the full PPAC metrics of the best configuration found so far.

KNOWHOW.md

An append-only knowledge base that accumulates cross-layer insights across iterations. Each entry includes what was tried, top results, what was learned, parameter–outcome quantification, and the next iteration plan.

Result CSV (result_wl{N}_{profile}.csv)

Each row in the result CSV includes:

  • arch_file — evaluated architecture JSON path
  • cost — weighted normalized optimization score (lower is better)
  • power, area, dollar, latency, energy — raw PPAC metrics
  • norm_energy, norm_area, norm_latency, norm_cost — normalized fields
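For example, the lowest-cost row can be picked out of such a CSV with the standard library (the two rows here are made-up data using an abbreviated subset of the columns listed above):

```python
import csv
import io

# Find the best (lowest-cost) architecture in a result CSV.
data = io.StringIO(
    "arch_file,cost,power,area\n"
    "test/arch_1.json,0.91,1.2,14.0\n"
    "test/arch_2.json,0.74,1.1,12.5\n"
)
best = min(csv.DictReader(data), key=lambda row: float(row["cost"]))
```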

Architecture files (test/arch_*.json)

These files capture the candidate architectures generated by the agent. Each file is structured as:

  • Chiplet information (tech node, systolic array size, SRAM buffer, area, power)
  • Package information (HI package type, connection topology, protocols, memory)
  • Workload mapping (dataflow, assignment order, split-K, data sharing)

Example arch.json files for multiple HI types and chiplet counts are provided in the config/gen_arch/temp_example directory.

Citation

If you find CHICO-Agent useful or relevant to your research, please cite our paper.
