This repository contains the dataset of selected functions used in the research paper "Does Representation Matter? Evaluating IRs for LLM-based Binary Decompilation".
The file selected_functions/selected.json lists all the functions used in our evaluation. For each function, it includes the verified path and the function signature, both derived from the ExeBench dataset (test real split).
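As a quick sanity check after cloning, the file can be loaded with Python's standard json module. The following is a minimal sketch, not part of the released tooling: the helper names load_selected and describe are hypothetical, and because the JSON schema is not documented here, the code only reports the top-level shape rather than assuming any field names.

```python
import json
from pathlib import Path


def load_selected(path="selected_functions/selected.json"):
    """Parse the selection file and return the resulting Python value.

    The default path assumes the script runs from the repository root.
    """
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def describe(data):
    """Return a schema-agnostic, one-line summary of a parsed JSON value."""
    if isinstance(data, dict):
        return f"object with {len(data)} keys"
    if isinstance(data, list):
        return f"array with {len(data)} entries"
    return f"scalar of type {type(data).__name__}"
```

Calling describe(load_selected()) from the repository root gives a one-line summary of the file's top-level structure, which is a reasonable starting point before writing code that depends on specific fields.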
@inproceedings{bar2026-does-representation-matter,
title = {{Does Representation Matter? Evaluating IRs for LLM-based Binary Decompilation}},
author = {Pelayo-Benedet, Tomás and Borgolte, Kevin and Rodríguez, Ricardo J.},
booktitle = {Proceedings of the 9th Workshop on Binary Analysis Research (BAR)},
date = {2026-02-27},
editor = {Bardin, Sébastien and Hauser, Christophe},
location = {San Diego, CA, USA},
publisher = {Internet Society (ISOC)},
volume = {9}
}
This research was supported in part by the Agencia Estatal de Investigación (AEI, Spanish State Research Agency) and the European Regional Development Fund (ERDF) of the European Union (EU) under grant PID2023-151467OA-I00 (CRAPER), by AEI and the EU Next Generation EU fund (NGEU) via the Recovery, Transformation and Resilience Plan (PRTR) under grant TED2021-131115A-I00 (MIMFA), by the Spanish National Cybersecurity Institute (INCIBE) and NGEU via PRTR under grant Proyecto Estratégico Ciberseguridad EINA UNIZAR, by the Dirección General de Promoción Industrial e Innovación of the Gobierno de Aragón (University, Industry and Innovation Department of the Aragonese Government) under grant Programa de Proyectos Estratégicos de Grupos de Investigación (DisCo research group, ref. T21-23R), and by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy – EXC 2092 CASA – 390781972.
Any opinions, findings, and conclusions or recommendations expressed in this work are those of the authors and do not necessarily reflect the views of the respective funding agencies.