Skip to content

Drop empty demos by default in merge_demos.py#728

Open
rwiltz wants to merge 1 commit into
mainfrom
rwiltz/hdf5-merge-script-drop-empty
Open

Drop empty demos by default in merge_demos.py#728
rwiltz wants to merge 1 commit into
mainfrom
rwiltz/hdf5-merge-script-drop-empty

Conversation

@rwiltz
Copy link
Copy Markdown
Collaborator

@rwiltz rwiltz commented May 27, 2026

Summary

Drop empty demos by default in merge_demos.py

Detailed description

  • Empty demos (no actions dataset or actions.shape[0] == 0) are now skipped automatically during the merge. They break replay_demos.py and the LeRobot conversion downstream, so there is no reason to keep them in a training-ready dataset. Dropped demos are listed by source file in the summary, and survivors renumber contiguously.
  • Adds per-demo schema fingerprinting so intra-file inconsistencies (e.g. one demo missing the camera observation the others have) are caught as hard errors instead of slipping through the previous "first demo only" check.
  • Auto-creates the output parent directory, and aborts with a clean error if every input demo is empty, so operators never end up with an unusable zero-demo dataset. The single-file invocation is still supported and now doubles as an empty-demo sanitizer for one half-finished session.
  • Updates Step 4b of the static-apple teleop doc to reflect the new default; expanded the test suite to 28 cases covering the auto-drop path, the all-empty rejection, intra-file mismatch detection, and the single-file-with-empties case.

Copy link
Copy Markdown
Contributor

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

This PR improves the robustness of merge_demos.py by automatically filtering out empty demos and detecting intra-file schema inconsistencies. The changes are well-structured, and the test coverage is comprehensive.

What Works Well

  • Clear separation between detection (in _inspect_file) and filtering (in _merge)
  • Good defensive handling for edge cases (all-empty inputs, single-file sanitization)
  • Excellent test coverage for the new functionality
  • Documentation updates are thorough and clear

Summary

Overall a solid PR. The minor suggestions below are mostly about clarity and edge-case logging. None block merge.


# "Empty" = unusable for replay / training. Either there is no actions dataset at
# all, or it has zero timesteps. record_demos.py shouldn't normally export these,
# but partial recordings and interrupted sessions can produce them.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion (minor): The intra-file schema check builds a fingerprint for every non-first demo, which is O(keys × demos). For files with many demos (>100) this could add noticeable overhead. Consider short-circuiting after the first mismatch is found since one is enough to raise an error:

if name != demo_names[0] and not intra_file_schema_mismatches:
    demo_fp = _build_schema_fingerprint(demo)
    if demo_fp != schema_fingerprint:
        intra_file_schema_mismatches.append(name)

Alternatively, if you want to report all mismatches, this is fine as-is, but perhaps add a comment noting the intentional choice.

untracked_step_demos: list[str] = []
empty_demo_names: list[str] = []
intra_file_schema_mismatches: list[str] = []
for name in demo_names:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: When "actions" is missing entirely, actions_len becomes 0, which is correct. However, the empty-demo detection below distinguishes between "no actions dataset" and "zero-length actions." Consider logging which case was hit for debugging purposes (e.g., missing actions dataset vs actions.shape[0] == 0), since these can indicate different recorder failure modes.

report.schema_status = "MISMATCH"
report.errors.append(
f"{i.path}: {len(actionable)} demo(s) ({shown}{suffix}) have a key/shape/dtype"
" set that differs from demo_0 in the same file (e.g. a missing camera"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Observation: The error message suggests "drop them upstream or re-record," but this PR now auto-drops empty demos. Consider clarifying that non-empty demos with schema mismatches require manual intervention, while empty mismatches are auto-handled.

with h5py.File(path, "r+") as f:
demo = f["data"][demo_name]
del demo["actions"]
demo.create_dataset(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test gap (minor): Consider adding a test case for a demo with actions present but containing only NaN or Inf values. While technically non-empty (shape > 0), such demos may also be unusable for training. This is outside the current PR scope but worth noting for future hardening.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 27, 2026

Greptile Summary

This PR adds automatic empty-demo filtering to merge_demos.py so that demos with no actions dataset or zero-length actions are silently dropped (with a summary log) rather than passed through to break replay_demos.py and LeRobot conversion. It also adds per-demo intra-file schema fingerprinting to catch single-demo corruption within one recording session, and aborts cleanly when every input demo is empty.

  • Empty-demo drop: _inspect_file classifies each demo as empty (actions missing or actions.shape[0] == 0), _merge skips them via skip_set, survivors are renumbered contiguously, and the dropped list is threaded back to _print_summary.
  • Intra-file schema check: each demo (after demo_0) is fingerprinted and compared to demo_0's schema; mismatches on non-empty demos become hard errors, while mismatches on empty demos (which will be dropped anyway) are suppressed.
  • Test suite: 6 new cases cover the auto-drop path, single-file sanitization, all-empty rejection, intra-file mismatch detection, and the "empty-AND-mismatch" suppression path; existing failure-cleanup tests updated to accept the new 4-tuple return value from _merge.

Confidence Score: 3/5

Safe to merge for the common case, but one edge-case will incorrectly block a valid merge when the very first demo in a file has its actions key entirely absent.

The empty-filtering logic works correctly when empty demos carry a zero-length actions dataset (the normal interrupted-recording shape), but when demo_0 has no actions key at all, its schema fingerprint is built without that key. Every valid demo behind it then looks like an intra-file schema mismatch; those demos are not in empty_set, so they become "actionable" errors and the merge aborts — blaming the good demos. This is a realistic failure path (a session whose very first recording attempt never wrote a single step), it's not covered by the new tests, and it could frustrate operators who are trying to sanitize exactly this kind of partial-session file.

merge_demos.py lines 152–184: the reference fingerprint and intra-file mismatch loop need a second look for the case where demo_0 itself is empty via missing actions key.

Important Files Changed

Filename Overview
isaaclab_arena/scripts/imitation_learning/merge_demos.py Adds empty-demo filtering, intra-file schema fingerprinting, and all-empty-abort guard; one edge case: reference fingerprint taken from demo_0 before empty-demo classification, producing false schema errors when demo_0 itself has no actions key.
isaaclab_arena/tests/test_merge_demos.py Adds 6 new test cases covering auto-drop, all-empty rejection, intra-file mismatch detection, and single-file-with-empties; only exercises zero-length-actions empties, leaving the no-actions-key-in-demo_0 path untested.
docs/pages/example_workflows/static_apple/step_2_teleoperation.rst Documentation-only update: adds a note about automatic empty-demo filtering and updates the schema-error description to cover intra-file mismatches; no issues.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Parse args & validate paths] --> B[_inspect_file for each input]
    B --> C{File readable & has demo_* groups?}
    C -- No --> ERR1[Return exit code 2]
    C -- Yes --> D[Build schema fingerprint from demo_0\nLoop over all demos:\n- count steps\n- detect empty demos\n- detect intra-file schema mismatches]
    D --> E[_validate_compatibility]
    E --> F{format_version / cross-file schema / intra-file mismatches?}
    F -- Hard errors --> ERR2[Print summary + error, return 1]
    F -- OK / warnings only --> G{All demos empty?}
    G -- Yes --> ERR3[Print summary + ERROR every demo empty, return 1]
    G -- No --> H{--dry_run?}
    H -- Yes --> I[Print summary, return 0/1]
    H -- No --> J[os.makedirs output dir]
    J --> K[_merge to .tmp file\nSkip demos in empty_demo_names\nRenumber survivors contiguously\nRecompute data.attrs.total]
    K --> L{Exception?}
    L -- Yes --> ERR4[Cleanup .tmp, return 1]
    L -- No --> M[os.replace .tmp to output]
    M --> N[_print_summary with dropped list]
    N --> O[Return 0]
Loading

Reviews (1): Last reviewed commit: "Adding logic to drop empty teleop record..." | Re-trigger Greptile

Comment on lines +152 to 156
# Sample the file-level reference schema from the first demo. record_demos.py writes
# a uniform recorder stack within a single file, so demo_0 is a reasonable reference;
# the per-demo loop below catches the rare case where a single demo in the file
# diverges (e.g. partial recording, manual edit).
schema_fingerprint = _build_schema_fingerprint(data_group[demo_names[0]])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 False intra-file schema error when demo_0 is the empty demo (missing actions key)

The reference fingerprint is computed from demo_names[0] (line 156) before the loop that classifies empty demos. If demo_0 has no actions key at all (not merely a zero-length actions array, but the key is entirely absent — e.g. from a very early interrupted recording), schema_fingerprint will not contain actions. Every subsequent valid demo does have actions, so it is flagged as a schema mismatch. Those valid demos land in intra_file_schema_mismatches but are not in empty_set, making them "actionable". The merge aborts with a false error naming the good demos as the problem while the actual culprit (demo_0) is never called out.

The existing test suite only exercises the zero-length-actions path via _truncate_demo_actions (which keeps the actions key with shape (0, action_dim) so shape[1:] matches), so this scenario has no coverage. A recording session where the very first demo never wrote a single step is a realistic edge case for interrupted teleop sessions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant