Skip to content

Review#152

Merged
dimbleby merged 10 commits into
mainfrom
review
Jun 1, 2026
Merged

Review#152
dimbleby merged 10 commits into
mainfrom
review

Conversation

@dimbleby
Copy link
Copy Markdown
Owner

@dimbleby dimbleby commented Jun 1, 2026

No description provided.

dimbleby and others added 10 commits June 1, 2026 21:29
Demoting an empty synthetic placeholder header cleared the parent's
_header_ref and ancestor refs/index but left parent._body_tail
pointing at the now doc-unlinked header. Subsequent direct KV
appends on the (now implicit) parent took the 'have a body tail'
fast-path in install_dotted_kv_slot and spliced the new KV after the
orphaned slot, so the value disappeared from the rendered document
while still living in the dict.

Set parent._body_tail = None alongside parent._header_ref = None;
that matches what _recompute_body_tail would compute for the now
header-less, body-empty container and is its original state before
the placeholder was materialised.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
_maybe_demote_synthetic_empty_header rebuilt parent._refs by filtering
and walked the ancestor chain rebuilding each grand._refs / _index by
hand. The walk dropped the binding refs from ancestor containers but
never touched the demoted header's own _refs back-pointer list, leaving
dead entries pointing at a doc-unlinked slot. That violates the
Slot._refs invariant the architecture relies on for O(depth) cleanup
elsewhere, even though no current code path revisits the orphan to
trip on the stale refs.

Replace the open-coded walk with _scrub_owned_slots_via_backptrs over
the header. The canonical helper iterates header._refs once, routing
each ref through unfile_ref — which drops parent._refs / clears
parent._header_ref via its own-header branch, drops the ancestor
binding refs from grand._refs / grand._index, and empties
header._refs. Same external behaviour, no residual back-pointers, and
the function shrinks to follow the documented bulk-scrub pattern.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
AoT.insert(idx, x) appended the new entry first and then normalised
idx = max(0, len(self) + idx) using the post-append length. The
negative-index normalisation thus landed one slot too high:
insert(-1, x) was silently identical to append(x), insert(-2, x) on
a three-entry AoT placed x at index 2 instead of 1, and so on.

Capture the pre-append length and normalise against it, matching
list.insert. Also collapse the positive-index clamp: positive
indices clamp to n_before (Python lets list.insert past-the-end
append), negative to max(0, n_before + idx).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ty no longer flags the keyword-only sort signature as
invalid-method-override, so the suppression has become
unused-ignore-comment noise. The companion mypy
'# type: ignore[override]' is still needed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
_set_eol_raw stamps an EOL section that carries its own terminating
newline, so the structural newline that used to terminate the item's
line must be removed to avoid duplication. That structural NL lives
in one of three places, depending on layout:

* inside the item's own target trivia (post_comma_trivia for
  has-comma items, trailing for no-comma items);
* on the next item's leading (has-comma, non-tail);
* in value.final_trivia (tail item, with or without trailing comma).

The downstream strip was gated on item.has_comma, so for a no-comma
terminal item (always the last) the NL in final_trivia was left in
place. Setting an EOL on item 0 of 'a = [\n    1\n]' rendered as
'a = [\n    1 # hi\n\n]\n' — an extra blank line before the
closing bracket.

Drop the has_comma gate: the existing 'nxt = items[idx + 1].leading
if idx + 1 < len(items) else value.final_trivia' fallback already
resolves to final_trivia for the no-comma tail. Track stripped via a
flag so we don't double-strip when the structural NL was inside the
item's own target.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Mirror of the set-EOL fix. ArrayEolView.__delitem__ blanked the
trailing EOL section and then conditionally restored the structural
newline downstream — but the restore was gated on item.has_comma,
so for the no-comma terminal item the structural NL the deleted EOL
had been providing was simply dropped. Deleting the only comment
from 'a = [\n    1 # hi\n]' rendered as 'a = [\n    1]\n', with
the closing bracket collapsed onto the value's line.

Drop the early return. The existing nxt fallback already resolves to
value.final_trivia for the tail item (with or without trailing
comma), which is exactly where the NL belongs for the no-comma case.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The TOML grammar restricts every digit position in date / time /
datetime literals to ASCII 0-9. The integer and float scanners
already guard with _is_ascii_digits, but the date/time paths used
str.isdigit() for detection and bare int(...) for parsing — both of
which accept Unicode decimal digits (Arabic-Indic, Devanagari, ...).
Inputs such as 'a = ٢٠٢٤-01-01' were silently accepted and decoded
to date(2024, 1, 1), even though the source contained no ASCII
digits in the year.

Apply the same _is_ascii_digits guard at every datetime/time digit
position:

* _looks_like_datetime: detection slices.
* _parse_datetime_token: the date↔time space-separator look-ahead.
* _parse_datetime_text: year / month / day before int().
* _parse_time_text: hh / mm / ss / fractional seconds before int().
* _parse_offset: tz hh / mm before int().

Add parametrised test cases covering each digit position.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
scan_eol returned a 3-tuple (trailing_ws, comment, newline) that
both parser call sites immediately unpacked and re-packed into an
EolTrivia. Build the EolTrivia inside the scanner instead — drops
the intermediate tuple, collapses a three-line type annotation to
one, and tightens each call site to 'eol = sc.scan_eol()' /
'eol=eol'.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dimbleby dimbleby enabled auto-merge (rebase) June 1, 2026 21:36
@dimbleby dimbleby merged commit cde0321 into main Jun 1, 2026
11 checks passed
@dimbleby dimbleby deleted the review branch June 1, 2026 21:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant