CHICO-Agent is an LLM-driven optimization framework that leverages heterogeneous integration (HI) co-design across the full design hierarchy—application, architecture, chiplet, and packaging. It replaces the stochastic perturbation mechanisms of traditional metaheuristics (e.g., simulated annealing) with a reasoning-driven optimization loop for cross-layer design space exploration (DSE). CHICO-Agent maintains a persistent knowledge base to capture parameter–outcome trends and coordinates exploration through a hierarchical admin–field multi-agent workflow, enabling systematic evaluation of trade-offs in next-generation AI hardware.
- .codex - Repo-based Codex configuration directory
- chiplet - Core chiplet modeling modules
- config - Configuration files (parameters, calibration, architecture examples, blacklists)
- config.py - Global configuration
- documentation - PPAC modeling reference documentation
- main.py - Main entry point
- Makefile - Build and run targets
- script - Utility scripts for launching runs and calibration
- system - DSE runners and simulation utilities
CHICO-Agent requires the following:
- Python >= 3.9
- pip >= 25.0
- Codex CLI (OpenAI)
Additionally, install the Python packages listed in requirements.txt; the steps below install them into a virtual environment.
```
git clone https://github.com/ASU-VDA-Lab/CHICO-Agent.git
cd CHICO-Agent
python3 -m venv path-ai
source path-ai/bin/activate
pip3 install -r requirements.txt
```

CHICO-Agent uses OpenAI Codex CLI as the LLM agent backbone. The default model is gpt-5.3-codex.
Follow the official Codex CLI installation guide to install the CLI tool. On most systems:
```
npm install -g @openai/codex
```

Codex CLI requires an OpenAI API key. Export it in your shell:

```
export OPENAI_API_KEY="sk-..."
```

You can add this to your shell profile (e.g., ~/.bashrc, ~/.zshrc) for persistence. Alternatively, you can log in with your OpenAI account using the /login command.
CHICO-Agent ships with a pre-configured .codex/config.toml that sets the model and sandbox permissions:
```toml
model = "gpt-5.3-codex"
model_reasoning_effort = "xhigh"
model_supports_reasoning_summaries = true
model_reasoning_summary = "detailed"
sandbox_mode = "workspace-write"
web_search = "disabled"

[sandbox_workspace_write]
writable_roots = ["/your/output/directory"]
```

**Important:** Update the `writable_roots` path in .codex/config.toml to point to your project's output directory before running.
The key configuration options are:
| Option | Description |
|---|---|
| `model` | The OpenAI model to use. Default: `gpt-5.3-codex` |
| `model_reasoning_effort` | Controls the model's reasoning depth: `low`, `medium`, `high`, `xhigh` |
| `model_reasoning_summary` | Reasoning summary verbosity: `detailed` |
| `sandbox_mode` | Sandbox permission mode. Must be `workspace-write` for CHICO-Agent |
| `web_search` | Web search capability. Set to `disabled` for reproducibility |
| `writable_roots` | Directories the agent is allowed to write to |
CHICO-Agent reads its parameters and configurations from the config directory. The core parameter files are located in the config/parameters folder. The primary file users need to modify is input.json, which defines the overall search space for CHICO-Agent. The full set of JSON parameter files is listed below:
```
config/parameters/
├── base_spec_7nm.json      [Area and power for different systolic arrays at 7nm]
├── base_spec_sram_7nm.json [Area and energy for different SRAM sizes at 7nm]
├── bonding_yield.json      [Bonding yield for different package types]
├── cost_profiles.json      [Weight factors for different optimization profiles]
├── d2d_input.json          [Die-to-die datarate, efficiency, pitch information]
├── energy_eff.json         [Die-to-die protocol and DRAM energy efficiency]
├── freq_scale.json         [Frequency scale across tech nodes]
├── input.json              [Main parameter file that defines the search space]
├── mem_bw.json             [Memory bandwidth for different systolic arrays]
├── scaling.json            [Area and power scaling for logic]
├── sram_energy.json        [SRAM energy values and SRAM energy scaling]
└── sram_scaling.json       [SRAM area and energy scaling]
```
Users can input different parameters of their choice by modifying the above parameter files. CHICO-Agent models the overall HI-system's area, power, energy, cost, and cycle-accurate latency using data from the above JSON files.
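As an illustration of editing these parameter files, a small helper along these lines could load one, apply overrides, and write it back (the helper name and override keys are hypothetical; the actual JSON schemas are defined by the files themselves):

```python
import json
from pathlib import Path

def update_params(path, overrides):
    """Load a parameter JSON, apply top-level overrides, and write it back.
    Generic sketch; key names depend on the repository's schemas."""
    p = Path(path)
    params = json.loads(p.read_text())
    params.update(overrides)  # overwrite only the keys being changed
    p.write_text(json.dumps(params, indent=2))
    return params
```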
CHICO-Agent supports multiple optimization profiles: balance, mobile, wearables, automotive, and cost-optimized. Users can additionally customize the weight of various metrics according to their preferences.
Since the metrics used by CHICO-Agent have different units and scales, normalization is necessary to prevent any single term from dominating the cost function. The files below contain the calibrated data for six different workloads. Calibration was performed on 10,000 samples, and both the median and minimum values were used.
```
config/calibration/
├── calibration_1.json
├── calibration_2.json
├── calibration_3.json
├── calibration_4.json
├── calibration_5.json
└── calibration_6.json
```
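The weighted-normalization idea can be sketched as follows; the weight values, baseline numbers, and metric names here are illustrative only, not the repository's actual calibration data:

```python
def weighted_cost(metrics, baselines, weights):
    """Divide each raw metric by its calibrated baseline, then combine the
    dimensionless terms with profile-specific weights. Lower is better."""
    return sum(w * metrics[name] / baselines[name]
               for name, w in weights.items())

# Illustrative "mobile"-style profile emphasizing energy and area
weights = {"energy": 0.4, "area": 0.3, "latency": 0.2, "dollar": 0.1}
baselines = {"energy": 2.0, "area": 50.0, "latency": 1e6, "dollar": 20.0}
metrics = {"energy": 1.0, "area": 25.0, "latency": 5e5, "dollar": 10.0}
# every metric is half its baseline, so the weighted cost comes out to 0.5
```

Because each term is normalized before weighting, no single metric dominates purely because of its units or scale.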
The command to run calibration on a particular workload is shown below; by default it runs 10,000 samples:

```
make calibration WORKLOAD=5
```

**NOTE:** Please ensure the old config/calibration/calibration_*.json file is deleted prior to running a new calibration.
CHICO-Agent computes cycle-accurate latency for the AI workloads it runs, which can be time-intensive. To address this, we implemented a lookup table–based simulation cache that dynamically stores key parameters such as systolic array size, workload shape, memory bandwidth, SRAM size, data flow, and the computed cycle count. During the optimization, the simulator is invoked only if a cache miss occurs (i.e., a configuration has not been encountered before). This approach significantly speeds up the computation. Additionally, the simulation cache is configured to automatically update on a miss, enabling faster execution for subsequent runs.
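The cache described above can be sketched as a dictionary keyed on the configuration fields (the field names here are illustrative, and `simulate` stands in for the cycle-accurate simulator):

```python
cache = {}

def cycles_for(config, simulate):
    """Return the cycle count for a configuration, invoking the
    simulator only on a cache miss."""
    key = (config["array"], config["workload"], config["mem_bw"],
           config["sram"], config["dataflow"])
    if key not in cache:
        cache[key] = simulate(config)  # miss: run the expensive simulation
    return cache[key]
```

On a repeat configuration the stored cycle count is returned directly, which is what makes subsequent runs faster.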
Not all parameter combinations produce physically realizable systems. The file config/config_blacklist.json enumerates illegal combinations, including mismatched protocol assignments, unstable 3D stacks, and incorrect HI classifications. CHICO-Agent validates every proposed configuration against this blacklist before evaluation.
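One way such a check might look, with each blacklist entry modeled as a set of parameter values that may not appear together (an assumed structure; the actual config_blacklist.json schema may differ):

```python
def violates_blacklist(config, blacklist):
    """True if the configuration matches any illegal combination:
    an entry matches when all of its key/value pairs appear in config."""
    return any(all(config.get(k) == v for k, v in entry.items())
               for entry in blacklist)

# Hypothetical illegal combination: a 3D stack declared inside a 2D package
blacklist = [{"package": "2D", "stack": "3D"}]
```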
CHICO-Agent can be launched in several ways. It performs an extensive design space exploration, and because the search space is vast, run times vary by workload.
To run on a new GEMM workload, update workload.json and use the assigned workload number in the commands below. Each entry in workload.json is in M, K, N format, as shown below:

```json
"1": [128, 2048, 1000],
```

Here, workload 1 has M=128, K=2048, and N=1000.
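As a quick sanity check of the shape convention, workload 1 above multiplies a 128×2048 matrix by a 2048×1000 matrix:

```python
# C (M x N) = A (M x K) @ B (K x N): the GEMM performs M*K*N multiply-accumulates
M, K, N = 128, 2048, 1000
macs = M * K * N  # 262,144,000 MACs for workload 1
```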
```
make cost ARCH_FILE=output/wl1_mobile/test/arch_1.json \
     WORKLOAD=1 \
     COST_PROFILE=mobile \
     RESULT_CSV=output/wl1_mobile/result_wl1_mobile.csv
```

To evaluate a directory of architecture JSONs in batch:

```
make cost_batch ARCH_DIR=output/wl1_mobile/test/ \
     WORKLOAD=1 \
     COST_PROFILE=mobile \
     RESULT_CSV=output/wl1_mobile/result_wl1_mobile.csv
```

To run calibration:

```
make calibration WORKLOAD=5
```

| Variable | Default | Description |
|---|---|---|
| `ARCH_FILE` | `config/gen_arch/temp_example/3d_example.json` | Architecture JSON to evaluate |
| `WORKLOAD` | `1` | Workload index (1–6) |
| `COST_PROFILE` | `mobile` | Profile from `config/parameters/cost_profiles.json` |
| `RESULT_CSV` | unset | Optional append target path for result CSV |
| `RUN_NAME` | generated if omitted | Optional simulation/cache run label |
To delete old calibration files for all workloads:

```
python script/clean_calibration.py
```

To launch calibration in parallel for all workloads, update RUN_NAME and WORKLOADS in run_parallel_calibration.sh and launch:

```
./script/run_parallel_calibration.sh
```

CHICO-Agent uses the Codex CLI to drive the LLM agent loop. The launcher script run_codex.sh orchestrates the full optimization workflow: creating output directories, starting new Codex sessions, and resuming sessions across iterations.
To run CHICO-Agent on a single workload with a specific cost profile:

```
./script/run_codex.sh <workload> <cost_profile> <max_iterations>
```

For example, to optimize workload 1 with the mobile profile for 10 iterations:

```
./script/run_codex.sh 1 mobile 10
```

To launch CHICO-Agent in parallel across all 6 workloads for a specific cost profile:

```
./script/run_codex.sh --parallel <max_iterations> <cost_profile>
```

For example, to run all workloads with the automotive profile for 5 iterations:

```
./script/run_codex.sh --parallel 5 automotive
```

To sweep all workloads across all cost profiles (mobile, balance, automotive, wearables):

```
./script/run_codex.sh --parallel <max_iterations>
```

The launcher script supports several environment variables for customization:
| Variable | Default | Description |
|---|---|---|
| `CODEX_MODEL` | `gpt-5.3-codex` | OpenAI model to use |
| `CODEX_MODEL_REASONING_EFFORTS_INDEX` | `1` | Index into reasoning effort levels (1=low, 2=medium, 3=high, 4=xhigh) |
| `CODEX_COOLDOWN` | `10` | Seconds to wait between iterations |
| `OUTPUT_DIR` | `output` | Base output directory |
| `CODEX_SESSIONS_DIR` | `~/.codex/sessions` | Directory where Codex stores session files |
| `STAGGER_BASE` | `10` | Seconds between parallel job launches (to avoid API rate limits) |
Each iteration of the CHICO-Agent loop proceeds as follows:
- **First iteration:** The script runs `codex exec` with an initial prompt (e.g., "Find the minimum cost for workload 1 with cost profile mobile"). Codex spawns a new session and the agent begins exploring the design space.
- **Subsequent iterations:** The script builds a structured state-injection prompt from the current `KNOWHOW.md`, `BEST.csv`, and result CSV, then resumes the existing Codex session with `codex exec <prompt> resume <session_id>`. This provides the agent with full context of prior exploration.
- **Session persistence:** The Codex session ID is saved to `SESSION_ID` in the output directory, allowing the loop to resume across script restarts.
The launcher script creates run directories following the pattern:
```
output/wl{N}_{profile}_{I}itr_m{M}_re{RE}/
```

where `{N}` is the workload index, `{profile}` the cost profile, `{I}` the max iterations, `{M}` the model index, and `{RE}` the reasoning effort index.
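For example, substituting the values from the sample output shown later in this README (a sketch, assuming a straightforward template expansion):

```python
# Run-directory naming pattern: output/wl{N}_{profile}_{I}itr_m{M}_re{RE}/
n, profile, iters, m_idx, re_idx = 1, "mobile", 5, 0, 1
run_dir = f"output/wl{n}_{profile}_{iters}itr_m{m_idx}_re{re_idx}/"
# expands to output/wl1_mobile_5itr_m0_re1/
```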
| Target | Description |
|---|---|
| `make calibration` | Builds normalization baselines for a workload/profile |
| `make validate` | Validates architecture JSONs against the blacklist |
Once CHICO-Agent completes, it creates a run folder named by workload and profile under the output directory. It contains the BEST.csv, KNOWHOW.md, and result CSV files generated during the run. Below is an example of the output for the wl1 mobile optimization-profile case:
```
output/wl1_mobile_5itr_m0_re1/
├── BEST.csv
├── KNOWHOW.md
├── result_wl1_mobile.csv
└── test/
    ├── arch_1.json
    ├── arch_2.json
    └── ...
```
BEST.csv tracks the global minimum cost across iterations. Each row records the iteration number, elapsed time, cumulative architectures evaluated, and the full PPAC metrics of the best configuration found so far.
KNOWHOW.md is an append-only knowledge base that accumulates cross-layer insights across iterations. Each entry records what was tried, the top results, what was learned, parameter–outcome quantification, and the plan for the next iteration.
Each row in the result CSV includes:
- `arch_file`: evaluated architecture JSON path
- `cost`: weighted normalized optimization score (lower is better)
- `power`, `area`, `dollar`, `latency`, `energy`: raw PPAC metrics
- `norm_energy`, `norm_area`, `norm_latency`, `norm_cost`: normalized fields
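Given those columns, a minimal sketch for pulling the lowest-cost row back out of a result CSV (a generic helper, not part of the repository):

```python
import csv

def best_row(path):
    """Return the row with the lowest weighted cost from a result CSV."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    # cost is stored as text in the CSV, so convert before comparing
    return min(rows, key=lambda r: float(r["cost"]))
```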
The arch_*.json files under test/ capture the candidate architectures generated by the agent. Each file is structured as:
- Chiplet information (tech node, systolic array size, SRAM buffer, area, power)
- Package information (HI package type, connection topology, protocols, memory)
- Workload mapping (dataflow, assignment order, split-K, data sharing)
Example arch.json files covering multiple HI types and chiplet counts are provided in the config/gen_arch/temp_example directory.
If you find CHICO-Agent useful or relevant to your research, please cite our paper: