Add CLAUDE.md and improve docs for multi-file batch loading #56
Merged
martinv13 merged 6 commits on May 28, 2026
- Fix bug in getting_started.md example: used hardcoded path instead of loop variable
- Expand the note into a titled admonition explaining when/why to use it
- Add metadata usage to the example, since per-file metadata is the primary use case
- Mention cross-file deduplication benefit
- Clarify flat_data docstring on DataModel.parse_xml and Document.parse_xml

https://claude.ai/code/session_01Qjy5fqCLtxaEhjsUjdjV3R
…uplication claim

Revert flat_data docstring changes in DataModel.parse_xml and Document.parse_xml. Remove the incorrect claim that batching enables cross-file deduplication: deduplication is always cross-file, since it is based on a deterministic content hash.

https://claude.ai/code/session_01Qjy5fqCLtxaEhjsUjdjV3R
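To illustrate why content-hash deduplication does not depend on batching, here is a minimal sketch. It is not the library's hashing scheme: `content_hash`, `deduplicate`, the SHA-256/JSON encoding, and the sample records are all assumptions made for illustration. The point is only that identical content maps to the same key regardless of which file, or which load, it arrived in.

```python
import hashlib
import json


def content_hash(record):
    """Hypothetical deterministic hash: same content always gives the same key."""
    # Sorting keys makes the encoding independent of dict insertion order.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def deduplicate(records, seen=None):
    """Keep the first record seen for each content hash."""
    seen = {} if seen is None else seen
    for record in records:
        seen.setdefault(content_hash(record), record)
    return seen


# Two hypothetical files that share one record ("France").
file_a = [{"code": "FR", "name": "France"}, {"code": "DE", "name": "Germany"}]
file_b = [{"name": "France", "code": "FR"}, {"code": "ES", "name": "Spain"}]

# Loading both files in one batch...
batched = deduplicate(file_a + file_b)

# ...or one after the other against the same store gives the same keys.
store = deduplicate(file_a)
one_by_one = deduplicate(file_b, seen=store)
```

Under these assumptions both strategies end with the same three keys, which is why batching files together is not what makes deduplication cross-file.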
Summary
- Add CLAUDE.md with development commands (including in-memory DuckDB as the default way to run the full test suite) and a high-level architecture overview covering the table hierarchy, dialect system, and snapshot test workflow.
- Fix a bug in the getting_started.md multi-file loading example where the loop variable xml_file was replaced by the hardcoded string "path/to/file.xml", causing the same file to be re-parsed on every iteration.
- Expand the multi-file loading note with a title, an explanation of when and why to use the pattern, and a metadata argument in the example (the primary reason for per-file tracking).
- Clarify the flat_data parameter docstring on both DataModel.parse_xml and Document.parse_xml to explain what to pass and what it enables.
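
The loop-variable bug described in the summary can be reproduced in isolation. The sketch below does not use the project's API: `parse_xml` here is a hypothetical stand-in that only records what it was asked to parse, and the file names and the `source_file` metadata key are invented for illustration.

```python
from pathlib import Path


def parse_xml(xml_file, metadata=None):
    """Hypothetical stand-in for a parser call; records its arguments."""
    return {"source": str(xml_file), "metadata": metadata or {}}


def load_buggy(xml_files):
    # Bug: the hardcoded path ignores the loop variable, so the same
    # file is parsed once per iteration.
    return [parse_xml("path/to/file.xml") for xml_file in xml_files]


def load_fixed(xml_files):
    # Fix: pass the loop variable, and attach per-file metadata so each
    # parsed document can be traced back to its source file.
    return [
        parse_xml(xml_file, metadata={"source_file": Path(xml_file).name})
        for xml_file in xml_files
    ]


files = ["data/a.xml", "data/b.xml", "data/c.xml"]
buggy = load_buggy(files)
fixed = load_fixed(files)
```

In the buggy version every result points at `path/to/file.xml`; in the fixed version each result carries its own path and metadata.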