Describe the bug
On a Python repo (~82.6k graph nodes), route / HTTP_CALLS extraction produces false and unjoinable data:
- 1062 / 1213 Route nodes are empty URL-string-literal stubs (no method/route_path).
- url_path includes filesystem paths misclassified as HTTP (/root/.aws/credentials, /etc/crio/crio.conf) and str.split('/locations/') delimiters; os.remove/os.path.join are emitted as HTTP_CALLS.
- Client and server route sets are fully disjoint: Route<-[:HANDLES]- ∩ Route<-[:HTTP_CALLS]- = 0. No client call resolves to a handled endpoint; internal URLs (http://api-admin:8001) exist only as orphan Route name strings, 0 wired.
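To make the misclassified cases concrete, here is a rough sketch of the kind of filter that would reject them. It is not the extractor's actual logic: the function name, the prefix list, and the callee lists are all assumptions made up for illustration, and a real fix would need a broader, configurable set.

```python
from urllib.parse import urlsplit

# Hypothetical lists for illustration only; not taken from the extractor.
FS_PREFIXES = ("/root/", "/etc/", "/home/", "/var/", "/tmp/", "/usr/", "/opt/")
NON_HTTP_CALLEES = {"os.remove", "os.path.join", "str.split", "open"}
HTTP_CLIENT_PREFIXES = ("requests.", "httpx.", "urllib.request.")


def looks_like_http_target(literal: str, callee: str) -> bool:
    """Guess whether a string literal passed to `callee` is an HTTP target."""
    # File and string helpers never produce HTTP calls, whatever the argument.
    if callee in NON_HTTP_CALLEES or callee.startswith(("os.", "shutil.", "pathlib.")):
        return False
    try:
        parts = urlsplit(literal)
    except ValueError:
        # Malformed input such as an unbalanced IPv6 bracket.
        return False
    # Absolute URLs with an http(s) scheme and a host are accepted.
    if parts.scheme in ("http", "https") and parts.netloc:
        return True
    # Well-known filesystem roots are rejected even for HTTP client callees.
    if literal.startswith(FS_PREFIXES):
        return False
    # A bare path such as /api/v1/users counts only when the callee is an HTTP client.
    return literal.startswith("/") and callee.startswith(HTTP_CLIENT_PREFIXES)
```

With a check like this, the four examples from the list above would all be dropped, while http://api-admin:8001 passed to an HTTP client would be kept.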
Expected
Route extraction distinguishes HTTP URLs from filesystem paths / string-split args; client HTTP_CALLS resolve to server Routes (intra-repo at minimum).
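For the resolution part, something along these lines is what I would expect conceptually. This is only a sketch under assumptions: `route_pattern` and `resolve` are hypothetical names, the route templates (`/users/{id}`, `/health`) are invented examples, and I am assuming handler paths use `{param}` or `<param>` placeholders.

```python
import re
from urllib.parse import urlsplit


def route_pattern(template: str) -> re.Pattern:
    """Compile a route template such as /users/{id} into a regex."""
    # Split on {param} / <param> placeholders, keeping them in the result.
    parts = re.split(r"(\{[^}]*\}|<[^>]*>)", template.rstrip("/") or "/")
    regex = "".join(
        "[^/]+" if part.startswith(("{", "<")) else re.escape(part)
        for part in parts
    )
    return re.compile(f"^{regex}/?$")


def resolve(client_url: str, server_routes: list[str]) -> list[str]:
    """Return the server route templates that the client URL's path matches."""
    # Drop scheme and host so only the path is compared.
    path = urlsplit(client_url).path or "/"
    return [route for route in server_routes if route_pattern(route).match(path)]
```

For example, `resolve("http://api-admin:8001/users/42", ["/users/{id}", "/health"])` would return `["/users/{id}"]`, which is the edge that is currently missing between HTTP_CALLS targets and HANDLES targets.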
Repro
Index any repo mixing requests/HTTP clients with os.path/file I/O, then:
MATCH (a)-[:HTTP_CALLS]->(b:Route) RETURN b.url_path
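A small fixture of the following shape mixes the three patterns from the report. It is an illustrative sketch, not code from the affected repo, and I have not confirmed that this exact file triggers the bug; the function names and the `/projects/{project}/locations` path are made up.

```python
import os
import urllib.request

API_BASE = "http://api-admin:8001"  # internal URL of the kind mentioned above


def fetch_locations(project: str) -> bytes:
    # Genuine HTTP client call: the only candidate for an HTTP_CALLS edge.
    with urllib.request.urlopen(f"{API_BASE}/projects/{project}/locations") as resp:
        return resp.read()


def location_id(resource_name: str) -> str:
    # '/locations/' is a string delimiter here, not a URL path.
    return resource_name.split("/locations/")[1]


def drop_cached_credentials(home: str) -> None:
    # Pure filesystem handling; should not appear as a Route or HTTP_CALLS.
    path = os.path.join(home, ".aws", "credentials")
    if os.path.exists(path):
        os.remove(path)
```

After indexing a file like this, only `fetch_locations` should show up in the query result above.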
Confirmations