Skip to content

gnd: Fix add and unify subgraph scaffolding generators#6660

Open
incrypto32 wants to merge 11 commits into
masterfrom
incrypto32/gnd-add-fix
Open

gnd: Fix add and unify subgraph scaffolding generators#6660
incrypto32 wants to merge 11 commits into
masterfrom
incrypto32/gnd-add-fix

Conversation

@incrypto32

Copy link
Copy Markdown
Member

gnd add was a drifted copy of the init/scaffold code generation, and the copy never wrote schema.graphql — so adding a data source produced a subgraph that fails codegen/build. This unifies the two paths and fixes a cluster of related codegen bugs.

The scaffolder now decides every event's names once (a ResolvedEvent) and feeds one set of generators (schema, mapping, manifest), instead of init and add each rolling their own.

What's fixed:

  • add now writes the event entities to schema.graphql (the main bug)
  • Event-name collisions across data sources: renamed with the contract prefix by default, or merged into the existing entity with --merge-entities (previously a dead flag), keeping the handler so the new contract's events are still indexed
  • Overloaded events (same name, different params) are disambiguated (Transfer, Transfer1) in both add and init
  • Small integer types (uint8, int32, …) map to GraphQL Int instead of BigInt
  • Reserved-word parameter names (new, class, …) are escaped so the generated code compiles
  • add errors on a duplicate data-source name, and only updates networks.json when it exists

Cleanup:

  • One field-name sanitizer (was three), one mapping generator (was two), shared version constants

Notes:

  • Kept gnd's existing field-name conventions (camelCase, eventId) rather than graph-cli's (internal_id) — deliberate.
  • Tuple/struct params and a codegen-time entity-vs-schema check are intentionally left for follow-up PRs.

Collapse the duplicated code generation so init and add share one path,
removing the forked add mapping generator that had drifted from the
scaffold one:

- a single sanitize_field_name in scaffold/naming.rs (three copies removed)
- a single generate_event_handlers driven by a new ResolvedEvent model
  (passthrough for now); add's copy is deleted
- shared SPEC_VERSION / MAPPING_API_VERSION constants and one to_kebab_case

No behavior change: init output is byte-identical and add is unchanged.
`gnd add` generated mappings and manifest entries that referenced entity
types it never declared, so codegen and build failed. Declare them, and
tighten two edges:

- append an entity type per new event to schema.graphql
- error when the data source name already exists
- only update networks.json when the file is present
Decide each event's entity name once in a resolution pass, then feed the
schema, mapping and manifest generators from it:

- disambiguate events overloaded within one ABI (Transfer, Transfer1)
- on collision with an existing entity, rename with the contract prefix by
  default, or under --merge-entities reuse the entity without redeclaring
  it while still generating the handler, so the new contract's events are
  indexed instead of silently dropped

Wires up the previously-dead --merge-entities flag.
Solidity integers narrow enough to fit an i32 (signed up to 32 bits,
unsigned up to 24 bits) now map to GraphQL Int, matching the generated
AssemblyScript event params; wider integers stay BigInt. Previously every
int/uint became BigInt, which mismatched the bindings.
A parameter named `new`, `class`, etc. produced an entity field the
generated AssemblyScript cannot declare. Suffix reserved words with `_`.
gnd's existing field-name conventions (camelCase, eventId/eventType) are
kept intentionally.
End-to-end check that two events sharing a name in one ABI produce
distinct entities (Ping, Ping1) and handlers (handlePing, handlePing1)
through the add path.
Move overloaded-event name disambiguation into a shared disambiguate_events
helper used by both init's generators and add's resolve_events. init
previously emitted duplicate entity types and handlers for an ABI with two
same-named events; it now suffixes repeats (Ping, Ping1) the way add does.
The mapping wrote `entity.<field> = event.params.<raw-name>`, but the ABI
codegen escapes reserved-word getters (`new` -> `new_`) and names unnamed
params `param<index>`. So a parameter named `new`, or an unnamed parameter,
produced a mapping referencing a getter that does not exist (or an empty
accessor that did not compile). Mirror the binding's getter naming on the
right-hand side via a shared event_param_accessors helper.
resolve_events treated only the manifest's mapping.entities as existing, so
a type hand-written in schema.graphql but not listed in any mapping was not
seen as a collision: add appended a duplicate `type` and codegen failed on
the redeclaration. Union the @entity types parsed from schema.graphql into
the collision set, since the schema is the real source of truth for declared
types. Falls back to manifest-only if the schema is missing or unparseable.
sanitize_field_name shipped its own RESERVED_WORDS list, a subset missing
default/for/instanceof/null/void. A param named e.g. `void` was left
unescaped as the entity field, while the schema/ABI codegen escaped the
member to `void_` via the shared list, so the mapping referenced a member
that did not exist. Escape via crate::shared::handle_reserved_word so both
sides use one list.
resolve_events checked the raw event name, so a disambiguated overload alias
(e.g. Transfer1) that collides with an existing entity was neither renamed
nor merged and silently redeclared the type. Check entity_name (the name we
actually declare) instead.
// Integer types - all map to BigInt for simplicity
t if t.starts_with("uint") || t.starts_with("int") => "BigInt",
// Integers: small widths fit in an i32 (GraphQL Int), the rest need BigInt.
t if t.starts_with("uint") || t.starts_with("int") => int_to_graphql(t),

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do something similar for Int8?

@lutter

lutter commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

Part of me thinks we should not do the complicated renaming on collision, but I think it would force users to not use gnd add, and adding parameters to gnd add that explicitly changes the name (like gnd add --rename Token:MyToken) seems a bit clunky.

Here's also what Claude had to say, I think it caught a few corner cases of the renaming machinery; I think they are all pretty esoteric corner cases, not sure if it's worth addressing them.

What it does: gnd add was a drifted copy of the init scaffolding logic that never wrote event entities to schema.graphql, so adding a data source produced a subgraph that failed codegen/build. This PR introduces a single ResolvedEvent model — names (handler alias, entity name, whether to declare in schema) are decided once and fed to one set of schema/mapping/manifest generators. It also fixes overloaded-event disambiguation, cross-datasource entity-name collisions (rename or --merge-entities), small-int → GraphQL Int mapping, reserved-word escaping, and duplicate-datasource-name detection. It's well-structured and well-tested; the accessor generation correctly mirrors the ABI codegen's disambiguate_event_params.

The findings below are edge cases that survived verification — all produce the exact "invalid schema → build fails" outcome the PR is trying to eliminate.

Findings

  1. gnd/src/commands/add.rs (resolve_events, ~line 303) — a renamed colliding entity isn't checked against sibling events in the same batch, producing a duplicate @entity type. (CONFIRMED)
    When an event collides with an existing entity and --merge-entities is off, it's renamed to <contract_name>, but that name is only checked against the pre-existing existing set — never against the other events being added. Concrete: existing schema declares Transfer; the new ABI for contract Token has events Transfer and TokenTransfer. Transfer collides → renamed to TokenTransfer; the ABI's own TokenTransfer event keeps entity_name = TokenTransfer. Both ResolvedEvents end up with entity_name = TokenTransfer and declare_in_schema = true, so add_schema_entities emits type TokenTransfer @entity twice → invalid schema, codegen/build fails. A HashSet of already-assigned entity names threaded through the loop (and inserted as each name is finalized) would close this.

  2. gnd/src/scaffold/schema.rs (generate_event_entity, ~line 65) & mapping.rs (generate_single_handler) — params that sanitize to the same identifier produce a duplicate entity field. sanitize_field_name has no de-duplication (unlike the new event_param_accessors, which suffixes collisions). Two unnamed params both sanitize to value; Value and value both camelCase to value. The schema then declares value: T! twice (invalid GraphQL) and the mapping emits two entity.value = … assignments. Pre-existing behavior, but it lives in the touched, now-unified generators and defeats the PR's goal; the accessor side was fixed while the field-name side was not.

  3. gnd/src/scaffold/manifest.rs (disambiguate_events, ~line 780) — the overload suffix can collide with a real event whose name already carries that suffix. An ABI with events Transfer, Transfer, Transfer1 yields aliases Transfer, Transfer1, Transfer1 (the counter for Transfer doesn't know Transfer1 also exists as a distinct name). Result: duplicate handleTransfer1 and duplicate Transfer1 entity. Rare, but it's the same broken-build failure mode and now runs on both init and add.

  4. gnd/src/commands/add.rs (schema_entity_types / entity_types_in_schema, ~line 226) — a schema that fails to parse silently drops collision detection. If graphql_tools::parser::schema can't parse the subgraph's schema.graphql, an empty set is returned and the code falls back to manifest entities lists only. A hand-written type Foo @entity not listed in any mapping's entities would then be missed, and add_schema_entities re-declares it → duplicate type. Graceful-ish (manifest-listed entities are still caught), so lowest severity, but worth a log line rather than a silent empty fallback.

Findings 1 and 2 are the ones most worth acting on before merge; 3 and 4 are low-probability hardening.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants