Skip to content

AIESW-32720: aiebu enhancement to check for duplicate file names#291

Open
HimanshuChoudhary-Xilinx wants to merge 3 commits into
Xilinx:main-gefrom
HimanshuChoudhary-Xilinx:AIESW-32720
Open

AIESW-32720: aiebu enhancement to check for duplicate file names#291
HimanshuChoudhary-Xilinx wants to merge 3 commits into
Xilinx:main-gefrom
HimanshuChoudhary-Xilinx:AIESW-32720

Conversation

@HimanshuChoudhary-Xilinx
Copy link
Copy Markdown
Collaborator

@HimanshuChoudhary-Xilinx HimanshuChoudhary-Xilinx commented May 19, 2026

Problem solved by the commit

Certain testcases seem to use the same asm file name multiple times causing confusion during debug. aiebu should be enhanced to catch duplicate file name situation during compile and provide a proper error message.

Bug / issue (if any) fixed, which PR introduced the bug, how it was discovered

How problem was solved, alternative solutions (if any) and why they were rejected

Risks (if any) associated the changes in the commit

What has been tested and how, request additional testing if necessary

Documentation impact (if any)

Copilot AI review requested due to automatic review settings May 19, 2026 08:33
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds duplicate ASM filename detection to the in-memory artifact API (file_artifact::add_vfile) and to the ASM .include preprocessor flow, plus new negative tests to validate the behavior.

Changes:

  • Throw aiebu::error(error_code::invalid_input) when adding the same virtual filename twice via file_artifact::add_vfile.
  • Track .included filenames during ASM parsing and throw invalid_input on duplicates.
  • Add new negative tests (C++ API test binary + an aiebu-asm CTest case) to exercise duplicate detection.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
test/cpp_test/cpp_api/CMakeLists.txt Builds/runs a new C++ negative-test executable for duplicate filename errors.
test/cpp_test/cpp_api/aiebu_error_test.cpp New C++ negative tests for duplicate add_vfile() and duplicate .include.
test/aie4-ctrlcode/CMakeLists.txt Adds the new duplicate_include ctrlcode test directory to the test suite.
test/aie4-ctrlcode/duplicate_include/CMakeLists.txt Adds a negative aiebu-asm -t aie4_config CTest intended to pass on the expected error output.
test/aie4-ctrlcode/duplicate_include/config.json Minimal config referencing main.asm for the negative include-chain test.
test/aie4-ctrlcode/duplicate_include/main.asm Entry ASM that triggers the duplicate include filename scenario.
test/aie4-ctrlcode/duplicate_include/test.asm Intermediate include file for the include-chain setup.
test/aie4-ctrlcode/duplicate_include/dup.asm Leaf ASM file intended to be included twice (triggering the duplicate error).
src/cpp/preprocessor/asm/asm_parser.h Adds per-parser tracking of seen include filenames.
src/cpp/preprocessor/asm/asm_parser.cpp Enforces duplicate include detection in .include processing.
src/cpp/assembler/aiebu_assembler.cpp Enforces duplicate virtual filename detection in file_artifact_impl::add_vfile.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +11 to +21
# PASS_REGULAR_EXPRESSION makes CTest pass when the expected error string
# appears in the tool output (regardless of exit code).
add_test(NAME "aie4_config_duplicate_include_error"
COMMAND aiebu-asm -t aie4_config
-j "${CMAKE_CURRENT_SOURCE_DIR}/config.json"
-o duplicate_include.elf
-f disabledump
WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR})

set_tests_properties("aie4_config_duplicate_include_error" PROPERTIES
PASS_REGULAR_EXPRESSION "duplicate asm file name")
if (r.passed) {
std::cout << "PASS: " << t.name << "\n";
} else {
std::cout << "FAIL: " << t.name << " – " << r.message << "\n";
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions


static std::vector<char> to_buf(const std::string& s)
{
return std::vector<char>(s.begin(), s.end());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: avoid repeating the return type from the declaration; use a braced initializer list instead [modernize-return-braced-init-list]

)
             ^

{
struct tc { const char* name; result (*fn)(); };
const tc tests[] = {
{"duplicate_vfile", test_duplicate_vfile},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: do not declare C-style arrays, use std::array<> instead [cppcoreguidelines-avoid-c-arrays]

 tc tests[] = {
 ^

Copy link
Copy Markdown
Collaborator

@xuhz xuhz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the case we experienced was,
.include foo1_folder/bar.asm
.include foo2_folder/bar.asm

functionality wise, these 2 bar.asm files are different, so everything is working fine.
but when we declare dtrace probes, we use only the bar.asm instead of the full path because in real world, the path may be very long, and very inconvenient to use otherwise.

in this case, dtrace will not work if we declare probes in bar.asm.

I would like aiebu has at least warning, or even error for this case.

the current PR seems only check the uniqueness of the full path

Comment thread src/cpp/preprocessor/asm/asm_parser.h Outdated
@sonals
Copy link
Copy Markdown
Member

sonals commented May 22, 2026

the case we experienced was, .include foo1_folder/bar.asm .include foo2_folder/bar.asm

functionality wise, these 2 bar.asm files are different, so everything is working fine. but when we declare dtrace probes, we use only the bar.asm instead of the full path because in real world, the path may be very long, and very inconvenient to use otherwise.

in this case, dtrace will not work if we declare probes in bar.asm.

I would like aiebu has at least warning, or even error for this case.

the current PR seems only check the uniqueness of the full path

Please no warning for distinct full paths -- we should fix dtrace.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

Function(uint32_t file_idx, const std::string& name, offset_type high_pc, offset_type low_pc, uint32_t col, pageid_type pagenum)
: m_file_idx(file_idx), m_name(name), m_colnum(col), m_pagenum(pagenum), m_highpc(high_pc), m_lowpc(low_pc) {}
Function(std::shared_ptr<const detail::filename_table> filename_table,
uint32_t file_idx, const std::string& name, offset_type high_pc, offset_type low_pc,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 4 adjacent parameters of 'Function' of similar type are easily swapped by mistake [bugprone-easily-swappable-parameters]

           uint32_t file_idx, const std::string& name, offset_type high_pc, offset_type low_pc,
                                                       ^
Additional context

src/cpp/encoder/aie2ps/report.h:65: the first parameter in the range is 'high_pc'

           uint32_t file_idx, const std::string& name, offset_type high_pc, offset_type low_pc,
                                                                   ^

src/cpp/encoder/aie2ps/report.h:66: the last parameter in the range is 'pagenum'

           uint32_t col, pageid_type pagenum)
                                     ^

src/cpp/encoder/aie2ps/report.h:65: after resolving type aliases, 'offset_type' and 'uint32_t' are the same

           uint32_t file_idx, const std::string& name, offset_type high_pc, offset_type low_pc,
                                                       ^

src/cpp/encoder/aie2ps/report.h:65: after resolving type aliases, the common type of 'offset_type' and 'pageid_type' is 'unsigned int'

           uint32_t file_idx, const std::string& name, offset_type high_pc, offset_type low_pc,
                                                       ^

src/cpp/encoder/aie2ps/report.h:66: after resolving type aliases, the common type of 'uint32_t' and 'pageid_type' is 'unsigned int'

           uint32_t col, pageid_type pagenum)
           ^

Function(uint32_t file_idx, const std::string& name, offset_type high_pc, offset_type low_pc, uint32_t col, pageid_type pagenum)
: m_file_idx(file_idx), m_name(name), m_colnum(col), m_pagenum(pagenum), m_highpc(high_pc), m_lowpc(low_pc) {}
Function(std::shared_ptr<const detail::filename_table> filename_table,
uint32_t file_idx, const std::string& name, offset_type high_pc, offset_type low_pc,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: pass by value and use std::move [modernize-pass-by-value]

Suggested change
uint32_t file_idx, const std::string& name, offset_type high_pc, offset_type low_pc,
uint32_t file_idx, std::string name, offset_type high_pc, offset_type low_pc,

src/cpp/encoder/aie2ps/report.h:67:

-         : m_filename_table(std::move(filename_table)), m_file_idx(file_idx), m_name(name),
+         : m_filename_table(std::move(filename_table)), m_file_idx(file_idx), m_name(std::move(name)),

Signed-off-by: Himanshu Choudhary <Himanshu.Choudhary@amd.com>
Signed-off-by: Himanshu Choudhary <Himanshu.Choudhary@amd.com>
Signed-off-by: Himanshu Choudhary <Himanshu.Choudhary@amd.com>
@sonals
Copy link
Copy Markdown
Member

sonals commented May 29, 2026

@HimanshuChoudhary-Xilinx , can you clarify the usage of m_filename_table? Is it a singleton for an aiebu session? If not when is it created and who owns it? I see it is std::move'd to the Function object. What is the lifecycle of Function objects?

HEADER_ACCESS_GET_SET(uint32_t, linenumber);
const std::string& get_file() const { return detail::lookup_filename(m_file_idx); }
const std::string& get_file(const detail::filename_table& table) const {
return table.lookup_filename(m_file_idx);
Copy link
Copy Markdown
Member

@sonals sonals May 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be an operation on filename_table itself?
filename_table and file index should be unified so there is a single database.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have one database for filename and its index for one instance.
each opcode in asm record file index which when needed can be converted to filename from database lookup
i see a point to merge to one as it has redundant data but we maintain 2 just for faster lookup

class filename_table {
  std::vector<std::string> m_table;
  std::unordered_map<std::string, uint32_t> m_index;

@sonals
Copy link
Copy Markdown
Member

sonals commented May 29, 2026

As suggested above, the two databases, file name table and file index should be unified and instantiated as a singleton.

@HimanshuChoudhary-Xilinx
Copy link
Copy Markdown
Collaborator Author

@HimanshuChoudhary-Xilinx , can you clarify the usage of m_filename_table? Is it a singleton for an aiebu session? If not when is it created and who owns it? I see it is std::move'd to the Function object. What is the lifecycle of Function objects?

Hi @sonals,
m_filename_table is not singleton, its one per config instance and we can have same filenames in different instances (eg merged_control.asm).
we dont move table to function object as such but we are moving the shared_ptr.
Function object is only created when we want to add dump info in elf. Its scope is only in encoder.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants