This repository contains tools for predicting ChampSim simulation performance using early simulation data.
- `random_runner.py`: Script to generate random ChampSim configurations and SLURM job scripts.
- `parser.py`: Main script to parse ChampSim output JSON and logs into a single pickle dataset (per benchmark).
- `train.py`: Script to train the ExtraTrees regression model on the parsed dataset and save the trained model.
- `predict.py`: Script to generate predictions using trained models.
- `tests/evaluate_model.py`: Script to evaluate model performance with various holdout strategies.
- `mappers.json`: Defines mappings from database indices to ChampSim configuration strings.
- `config.json`: Top-level configuration for simulation parameters and paths.
- Clone the repository:
```
git clone --recursive https://github.com/stickneyaiden/EarlyPerf.git
# If already cloned:
# git submodule update --init --recursive
```
- Build Submodules:
- ChampSim:
```
cd ChampSim
git submodule update --init   # Set up the vcpkg submodule
./vcpkg/bootstrap-vcpkg.sh
./vcpkg/vcpkg install
# To verify the build (optional):
# ./config.sh champsim_config.json
# make
cd ..
```
- CACTI:
```
cd cacti
make
cd ..
```
- Install Dependencies:
pip install -r requirements.txt
- Environment: Ensure you have Python 3.8+ loaded.
- Configuration:
- Copy `config.json.template` to `config.json` and edit it with your specific paths and account information.

  ```
  cp config.json.template config.json
  # Edit config.json
  ```

- Copy `scripts/slurm_template.txt.template` to `scripts/slurm_template.txt` and customize it for your SLURM environment.

  ```
  cp scripts/slurm_template.txt.template scripts/slurm_template.txt
  # Edit scripts/slurm_template.txt
  ```
Note: By using the provided pretrained models in the `models/` directory, you can skip directly to the Prediction step.
To generate new training data by running simulations:
- Configure the simulation parameters in `config.json`.
- Modify the SLURM job template in `scripts/slurm_template.txt` to match your environment.
- Run the generation script:

  ```
  python3 scripts/random_runner.py
  ```
This will compile random configurations and create bash scripts to be submitted to SLURM. Simulation outputs (JSON and logs) will be stored in `output/<SIMULATION_RUN>/`.

Submit the generated SLURM scripts located in `batch/<SIMULATION_RUN>/`.
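As a rough illustration of the generation step, the sketch below draws one random configuration from a mappers-style table. The feature names and option strings here are invented for the example; the actual contents of `mappers.json` and the logic in `random_runner.py` may differ.

```python
import random

# Hypothetical stand-in for mappers.json: each feature maps
# database indices to ChampSim configuration strings.
# The real file's keys and values may differ.
mappers = {
    "BranchPredictor": {"0": "bimodal", "1": "gshare", "2": "perceptron", "3": "tage_sc_l"},
    "L1D_Prefetcher": {"0": "no", "1": "next_line", "2": "ip_stride"},
}

def sample_random_config(mappers, rng=random):
    """Pick one option string per feature at random."""
    return {feature: rng.choice(list(options.values()))
            for feature, options in mappers.items()}

print(sample_random_config(mappers))
```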
Parse raw simulation outputs (JSON/Logs) into a structured dataset (pickle).
```
python3 parser.py
```

This looks for data in `output/<SIMULATION_RUN>/json/` and `output/<SIMULATION_RUN>/logs/` and saves the result to `data/aggregated_output/`.
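For readers unfamiliar with the pickle format, the sketch below round-trips a toy per-benchmark record through a pickle file. The record's schema is invented for illustration; the structure `parser.py` actually emits may differ.

```python
import os
import pickle
import tempfile

# Toy stand-in for one parsed benchmark record; the real schema
# produced by parser.py (keys, nesting) may differ.
record = {
    "benchmark": "example_trace",
    "configs": [{"BranchPredictor": "tage_sc_l", "ipc": 1.42}],
}

out_dir = tempfile.mkdtemp()  # stand-in for data/aggregated_output/
path = os.path.join(out_dir, "example_trace.pkl")

with open(path, "wb") as f:
    pickle.dump(record, f)
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded["benchmark"])
```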
Train performance prediction models.
```
python3 train.py
# Optionally limit training data to NUM_CONFIGS defined in config.json:
# python3 train.py --limit-data
```

The script uses the parsed data for the specified simulation run (`SIMULATION_RUN`) with the defined `PREVIEW_SIM_POINTS`.
Models are saved to the models/ directory.
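The save/load cycle that `train.py` and `predict.py` rely on can be sketched on toy data as follows. The features, target, and serialization details here are assumptions for illustration; the real model files in `models/` may be stored differently.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# Toy illustration of the save/load cycle; the feature layout of the
# real dataset and the serialization used by train.py may differ.
rng = np.random.default_rng(0)
X = rng.random((64, 4))                       # stand-in features (e.g. early-sim counters)
y = X @ np.array([2.0, -1.0, 0.5, 0.0])       # stand-in target (e.g. final IPC)

model = ExtraTreesRegressor(n_estimators=50, random_state=0).fit(X, y)

path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)                     # save, as train.py does conceptually
with open(path, "rb") as f:
    reloaded = pickle.load(f)                 # load, as predict.py does conceptually

print(reloaded.predict(X[:1]))
```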
Run predictions using predict.py (you must have trained models first).
```
python3 predict.py --trace <path_to_trace> --binary <path_to_champsim_binary> --config <path_to_champsim_config>
```

Options:

- `--models_dir`: Directory containing trained models (default: `models/`).
- `--model_name`: Specific model file to use (optional).
- `--compare`: If set, runs a full simulation to compare actual IPC.
- `--compare_instructions`: Number of instructions for the full comparison run (default: 700 million).
The script uses `PREVIEW_SIM_POINTS` and `TOTAL_INSTRUCTIONS` from `config.json` to determine the number of instructions for the preview run.
Use the tools in tests/ to evaluate model performance. These tools do not use pretrained models and instead rely on the parsed dataset.
- General Evaluation (hold out a random configuration):

  ```
  python3 tests/evaluate_model.py --pickle <path_to_pickle> --mode random
  ```

- Index Holdout (e.g., hold out the configuration at index 5):

  ```
  python3 tests/evaluate_model.py --pickle <path_to_pickle> --mode index --index 5
  ```

- Group Holdout (e.g., hold out all `tage_sc_l` branch predictors):

  ```
  python3 tests/evaluate_model.py \
      --pickle <path_to_pickle> \
      --mode group \
      --feature BranchPredictor \
      --value tage_sc_l
  ```

- K-Fold Cross Validation (e.g., 3-fold):

  ```
  python3 tests/evaluate_model.py --pickle <path_to_pickle> --mode kfold --k 3
  ```

- Leave-One-Out Cross Validation:

  ```
  python3 tests/evaluate_model.py --pickle <path_to_pickle> --mode loo
  ```
Options:
- `--pickle`: Path to the pickle file containing parsed data.
- `--mode`: Holdout mode: `random`, `index`, `group`, `kfold`, or `loo`.
- `--index`: Index of the configuration to hold out (for `mode=index`).
- `--feature`: Feature name to hold out (for `mode=group`).
- `--value`: Feature value to hold out (for `mode=group`).
- `--mappers`: Path to `mappers.json` (required for `mode=group`).
- `--duration`: Duration of simulation samples/phases to use (preview).
- `--k`: Number of folds for k-fold cross validation (for `mode=kfold`).
This tool does not use config.json and requires the user to specify the pickle file and other parameters directly.
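Conceptually, the `kfold` mode resembles the sketch below on synthetic data; the real script instead reads features and targets from the parsed pickle, and its metrics and model settings may differ.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import KFold

# Synthetic stand-ins: configuration/early-IPC features and a final-IPC target.
rng = np.random.default_rng(42)
X = rng.random((60, 5))
y = X.sum(axis=1) + rng.normal(0, 0.05, 60)

errors = []
for train_idx, test_idx in KFold(n_splits=3, shuffle=True, random_state=0).split(X):
    model = ExtraTreesRegressor(n_estimators=100, random_state=0)
    model.fit(X[train_idx], y[train_idx])                      # train on k-1 folds
    pred = model.predict(X[test_idx])                          # evaluate on held-out fold
    errors.append(mean_absolute_percentage_error(y[test_idx], pred))

print(f"mean MAPE over 3 folds: {np.mean(errors):.3f}")
```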
- ChampSim: This tool expects ChampSim simulation outputs. The `ChampSim` directory is currently excluded from git via `.gitignore`.
- CACTI: Used for power estimation (referenced in configuration).