fix(discover): deploy/ directories indexed as duplicate source in full mode#448

Open: isc-tdyar wants to merge 1 commit into DeusData:main from isc-tdyar:fix/deploy-dir-always-skip

@isc-tdyar

deploy/ directories indexed as duplicate source in full mode, producing stale duplicate Class/Method rows

Discovered while querying a project that uses a deploy/ directory for build output. Queries like MATCH (c:Class {name:'Foo'}) returned multiple rows for the same class — one from the source file, one from the copy in deploy/.

Root cause

deploy and deployed were in FAST_SKIP_DIRS, which is only consulted in fast and moderate indexing modes. In the default full mode, FAST_SKIP_DIRS is not applied, so deploy/ was walked and indexed alongside source files. Any class that existed in both locations got duplicate rows in the graph with potentially stale data from the build copy.

// before: deploy/deployed only skipped in fast/moderate mode
static const char *FAST_SKIP_DIRS[] = { ..., ".vercel", ".netlify", NULL };

// after: always skipped
static const char *ALWAYS_SKIP_DIRS[] = { ..., ".vercel", ".netlify", "deploy", "deployed", NULL };

Fix

Move deploy and deployed from FAST_SKIP_DIRS to ALWAYS_SKIP_DIRS so they are excluded in all indexing modes.
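
The mode-dependent behaviour behind this change can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the `index_mode_t` enum, `name_in_list`, and `should_skip_dir` are hypothetical names, the skip lists are shortened, and the `FAST_SKIP_DIRS` entries shown are placeholders.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mode enum; the real project may name or order these differently. */
typedef enum { MODE_FULL, MODE_MODERATE, MODE_FAST } index_mode_t;

/* Shortened for illustration. After the fix, deploy/deployed live here. */
static const char *ALWAYS_SKIP_DIRS[] = { ".vercel", ".netlify", "deploy", "deployed", NULL };

/* Placeholder entries (assumption); only consulted in fast/moderate modes. */
static const char *FAST_SKIP_DIRS[] = { "docs", "examples", NULL };

/* Return true if name matches any entry of a NULL-terminated list. */
static bool name_in_list(const char *name, const char **list) {
    for (size_t i = 0; list[i] != NULL; i++) {
        if (strcmp(name, list[i]) == 0) return true;
    }
    return false;
}

/* Decide whether a directory name is pruned for the given indexing mode. */
static bool should_skip_dir(const char *name, index_mode_t mode) {
    if (name_in_list(name, ALWAYS_SKIP_DIRS)) return true;   /* every mode */
    if (mode != MODE_FULL && name_in_list(name, FAST_SKIP_DIRS)) return true;
    return false;                                            /* walk it */
}
```

In this sketch an entry that sits only in `FAST_SKIP_DIRS` is still walked when `mode` is `MODE_FULL`, which is the situation `deploy` and `deployed` were in before the move.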

Regression test

discover_deploy_excluded_full_mode in tests/test_discover.c — creates a temporary repo with src/, deploy/, and deployed/ subdirectories, runs discovery in full mode, and asserts only the file under src/ is returned (deploy and deployed are excluded).
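
The shape of such a test can be sketched in a self-contained way. The project's discovery API and test harness are not shown in this PR text, so the sketch below substitutes a toy recursive walker (`count_files`) and a toy skip list (`SKIP`); those names, `make_file`, `run_fixture`, and the `/tmp/discover_XXXXXX` template are all assumptions made for illustration. It assumes a POSIX system.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for mkdtemp; must precede the includes */
#endif
#include <dirent.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Toy skip list standing in for the project's ALWAYS_SKIP_DIRS. */
static const char *SKIP[] = { "deploy", "deployed", NULL };

static bool skipped(const char *name) {
    for (size_t i = 0; SKIP[i] != NULL; i++)
        if (strcmp(name, SKIP[i]) == 0) return true;
    return false;
}

/* Recursively count regular files under dir, pruning skipped directories. */
static int count_files(const char *dir) {
    DIR *d = opendir(dir);
    if (d == NULL) return 0;
    int n = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0) continue;
        char path[4096];
        snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
        struct stat st;
        if (stat(path, &st) != 0) continue;
        if (S_ISDIR(st.st_mode)) {
            if (!skipped(e->d_name)) n += count_files(path);
        } else if (S_ISREG(st.st_mode)) {
            n++;
        }
    }
    closedir(d);
    return n;
}

/* Create <root>/<sub>/Foo.c; returns 0 on success. */
static int make_file(const char *root, const char *sub) {
    char path[4096];
    snprintf(path, sizeof path, "%s/%s", root, sub);
    if (mkdir(path, 0700) != 0) return -1;
    snprintf(path, sizeof path, "%s/%s/Foo.c", root, sub);
    FILE *f = fopen(path, "w");
    if (f == NULL) return -1;
    fputs("int foo;\n", f);
    return fclose(f);
}

/* Build src/, deploy/, deployed/ in a temp dir and return the number of
 * files the walker finds, or -1 on setup failure. Cleanup of the temp
 * directory is omitted for brevity. */
static int run_fixture(void) {
    char root[] = "/tmp/discover_XXXXXX";
    if (mkdtemp(root) == NULL) return -1;
    if (make_file(root, "src") || make_file(root, "deploy") ||
        make_file(root, "deployed")) return -1;
    return count_files(root);
}
```

With all three subdirectories holding a copy of `Foo.c`, the walker should report only the copy under `src/`, which mirrors the assertion the regression test is described as making.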

…l mode

deploy/ and deployed/ build artifact directories were in FAST_SKIP_DIRS,
which only applies in fast/moderate indexing modes. In the default full
mode they were indexed alongside source files, creating duplicate
Class/Method rows with stale data. Any MATCH (c:Class {name:'Foo'})
without a file_path filter returned multiple rows — one from source,
one from the build copy.

Move deploy and deployed to ALWAYS_SKIP_DIRS so they are excluded in all
indexing modes.

Regression test: discover_deploy_excluded_full_mode in test_discover.c.

Signed-off-by: Thomas Dyar <tdyar@intersystems.com>
@DeusData

Owner

Thanks for the contribution — all gates are green including DCO, and the deploy/-directory duplication is a good catch. Will review when time allows.

2 participants