HIL-SERL tutorial for simulation fixed and expanded by spirosperos · Pull Request #1 · SpesRobotics/lerobot

spirosperos · 2025-08-06T10:55:18Z

What this does

This PR fixes two critical bugs in the HIL-SERL simulation training framework and adds comprehensive documentation:

Bug Fixes:

🐛 Bug Fix: control_time_s was not respected during training; episodes always lasted 10 seconds regardless of the configured value
🐛 Bug Fix: random_block_position was not passed to the gym environment, so cube positions were never randomized

Improvements:

📚 Documentation: Added comprehensive training guide (hil_serl_simulation_training_guide_README.md) for HIL-SERL simulation training
🔧 Enhancement: Added extensive debug logging for easier training monitoring and troubleshooting

Key Changes:

Implemented a TimeLimitWrapper class that enforces episode time limits based on the control_time_s configuration
Added a random_block_position parameter to the environment configuration and passed it through to the gym environment
Enhanced logging throughout the training process with detailed episode progress, time tracking, and environment state information
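
To make the time-limit change concrete, here is a minimal, dependency-free sketch of a wrapper that truncates an episode once control_time_s seconds have elapsed. It is not the code from this PR: the class body, the `clock` parameter, and the `time_remaining_s` info key are assumptions made for illustration, and the wrapped environment is only assumed to follow the Gymnasium-style `reset()` / `step()` return signatures.

```python
import time
from typing import Any, Callable, Optional


class TimeLimitWrapper:
    """Truncate an episode once `control_time_s` seconds have elapsed.

    Hypothetical sketch. Wraps any object exposing Gymnasium-style
    reset() -> (obs, info) and
    step(action) -> (obs, reward, terminated, truncated, info).
    """

    def __init__(
        self,
        env: Any,
        control_time_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if control_time_s <= 0:
            raise ValueError("control_time_s must be positive")
        self.env = env
        self.control_time_s = control_time_s
        self._clock = clock  # injectable so the limit can be tested without sleeping
        self._episode_start: Optional[float] = None

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        self._episode_start = self._clock()  # restart the timer for every episode
        return obs, info

    def step(self, action):
        if self._episode_start is None:
            raise RuntimeError("call reset() before step()")
        obs, reward, terminated, truncated, info = self.env.step(action)
        elapsed = self._clock() - self._episode_start
        if elapsed >= self.control_time_s:
            truncated = True  # time ran out; this is not a task success or failure
        info = dict(info)  # copy so the wrapped env's dict is not mutated
        info["time_remaining_s"] = max(0.0, self.control_time_s - elapsed)
        return obs, reward, terminated, truncated, info
```

Reporting the limit through `truncated` rather than `terminated` keeps a timeout distinguishable from the task actually ending, which matters when the learner bootstraps value estimates from the final state.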

How it was tested

Time Limit Fix: Verified that setting "control_time_s": 40.0 in the configuration now limits episodes to 40 seconds instead of the default 10 seconds
Cube Randomization Fix: Confirmed that "random_block_position": true now randomizes cube positions between episodes
Debug Logging: Checked the debug output during training to confirm it reports episode progress, time remaining, and environment state
Documentation: Verified that all commands and configurations in the new README work with the fixed implementation
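
The cube randomization check above only passes if the flag travels from the configuration object into the environment constructor. The sketch below illustrates that hand-off with a toy stand-in environment. `EnvConfig`, `ToyCubeEnv`, `make_env`, the default values, and the 0.1 offset range are all hypothetical; only the two config key names come from this PR.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class EnvConfig:
    # Field names mirror the config keys named in this PR; defaults are assumptions.
    control_time_s: float = 10.0
    random_block_position: bool = False


class ToyCubeEnv:
    """Stand-in environment; the real gym environment is not reproduced here."""

    HOME: Tuple[float, float] = (0.5, 0.0)

    def __init__(self, random_block_position: bool = False, seed: Optional[int] = None):
        self.random_block_position = random_block_position
        self._rng = random.Random(seed)
        self.block_xy = self.HOME

    def reset(self) -> Tuple[float, float]:
        if self.random_block_position:
            # Sample a new cube position around the home pose on every reset.
            x = self.HOME[0] + self._rng.uniform(-0.1, 0.1)
            y = self.HOME[1] + self._rng.uniform(-0.1, 0.1)
            self.block_xy = (x, y)
        else:
            self.block_xy = self.HOME
        return self.block_xy


def make_env(cfg: EnvConfig, seed: Optional[int] = None) -> ToyCubeEnv:
    # Omitting this keyword would silently fall back to the constructor default,
    # which is the kind of bug described in this PR.
    return ToyCubeEnv(random_block_position=cfg.random_block_position, seed=seed)
```

With the flag forwarded, two consecutive `reset()` calls return different positions when it is true and the same home position when it is false, which matches what the manual test looks for.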

Test Commands:

# Test recording with new time limit
python -m lerobot.scripts.rl.gym_manipulator --config_path examples/hil_serl_simulation_training/hi_rl_test_gamepad.json

# Test training with both fixes
python -m lerobot.scripts.rl.learner --config_path examples/hil_serl_simulation_training/train_gym_hil_env_gamepad.json
python -m lerobot.scripts.rl.actor --config_path examples/hil_serl_simulation_training/train_gym_hil_env_gamepad.json

How to checkout & try? (for the reviewer)

Test Time Limit Fix:

# Modify control_time_s in hi_rl_test_gamepad.json to 20.0 and verify episodes last 20 seconds
python -m lerobot.scripts.rl.gym_manipulator --config_path examples/hil_serl_simulation_training/hi_rl_test_gamepad.json
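
Instead of hand-editing the JSON for this check, a small helper can write a modified copy of the config. This is a hedged sketch: the nesting of control_time_s inside the config file is not shown in this PR, so the helper searches the whole document for the key. The function names are hypothetical.

```python
import json
from pathlib import Path
from typing import Any


def set_key_everywhere(node: Any, key: str, value: Any) -> int:
    """Set `key` to `value` wherever it already exists; return how many were changed."""
    count = 0
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
                count += 1
            else:
                count += set_key_everywhere(node[k], key, value)
    elif isinstance(node, list):
        for item in node:
            count += set_key_everywhere(item, key, value)
    return count


def override_config(src: str, dst: str, key: str, value: Any) -> int:
    """Copy the JSON config at `src` to `dst` with every occurrence of `key` overridden."""
    cfg = json.loads(Path(src).read_text())
    changed = set_key_everywhere(cfg, key, value)
    if changed == 0:
        # Fail loudly rather than writing an unchanged copy.
        raise KeyError(f"{key!r} not found in {src}")
    Path(dst).write_text(json.dumps(cfg, indent=2) + "\n")
    return changed
```

For example, `override_config("examples/hil_serl_simulation_training/hi_rl_test_gamepad.json", "hi_rl_test_20s.json", "control_time_s", 20.0)` would write a copy (the output filename is arbitrary) that can then be passed to `--config_path`.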

Test Cube Randomization:

# Set random_block_position to false in config and verify cube stays in same position
# Set to true and verify cube randomizes between episodes
python -m lerobot.scripts.rl.gym_manipulator --config_path examples/hil_serl_simulation_training/hi_rl_test_gamepad.json

Test Training with Both Fixes:

# Terminal 1: Start learner
python -m lerobot.scripts.rl.learner --config_path examples/hil_serl_simulation_training/train_gym_hil_env_gamepad.json

# Terminal 2: Start actor
python -m lerobot.scripts.rl.actor --config_path examples/hil_serl_simulation_training/train_gym_hil_env_gamepad.json

Review Documentation:

# Check the new training guide
cat examples/hil_serl_simulation_training/hil_serl_simulation_training_guide_README.md

Expected Behavior:

Episodes should respect the control_time_s setting (30s in training config, 40s in recording config)
Cube should randomize position when random_block_position: true
Debug logs should show detailed episode progress and time tracking
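
As a rough picture of the kind of progress line described above, the sketch below formats episode progress and remaining time. The message format, logger name, and `every_n_steps` parameter are assumptions for illustration and do not reproduce the PR's actual log output.

```python
import logging

logger = logging.getLogger("hil_serl_sketch")  # hypothetical logger name


def format_progress(episode: int, step: int, elapsed_s: float, control_time_s: float) -> str:
    """Build one progress line; remaining time is clamped at zero."""
    remaining = max(0.0, control_time_s - elapsed_s)
    return (
        f"episode={episode} step={step} "
        f"elapsed={elapsed_s:.1f}s remaining={remaining:.1f}s"
    )


def log_progress(episode: int, step: int, elapsed_s: float,
                 control_time_s: float, every_n_steps: int = 50) -> None:
    # Throttle output so the control loop is not flooded with log lines.
    if step % every_n_steps == 0:
        logger.debug(format_progress(episode, step, elapsed_s, control_time_s))
```

Calling `logging.basicConfig(level=logging.DEBUG)` before the loop makes these lines visible on the console.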

lukicdarkoo · 2025-08-08T09:07:07Z

The branches got mixed up in this PR; this is fixed in #2.

CarolinePascal and others added 30 commits July 28, 2025 11:09

fix(hf hub dependency): adding ceiling version on huggingface_hub (hu…

f089ab3

…ggingface#1608)

smolfix(vla): typing and fix offline inference when action in the bat…

615adfc

…ch (huggingface#1597)

bump wandb version to be compatible with ne grpcio-deps (huggingface#…

98746c7

…1604)

chore(pi0fast): TODO comment to warn the need for removal ignore_index (

b61a4de

huggingface#1593)

docs/style: updating docs and deprecated links (huggingface#1584)

664e069

fix(policies): remove action from batch for offline evaluation (huggi…

c3d5e49

…ngface#1609) * fix(policies): remove action from batch for offline evaluation in diffusion, tdmpc, and vqbet policies * style(diffusion): correct comment capitalization for clarity in modeling_diffusion.py

fix bug about sampling time from beta distribution (huggingface#1605)

4b88842

* fix bug about sampling t from beta distribution * fix: address review comments ---------

fix(config): typing correction on config.py (huggingface#1320)

7fe6ada

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>

fix(tokenizers dependency): adding ceiling version on tokenizers (hug…

b267cd4

…gingface#1612)

Fix sample beta for smolvla as done for pi0, remove sample_beta func (h…

c7c3b47

…uggingface#1611)

fix(dependencies): removing versions ceilings on tokenizers and huggi…

c14ab9e

…ngface_hub dependencies (huggingface#1618)

fix(DiffusionPolicy): Fix bug where training without image features w…

5695432

…ould crash with exception, fix environment state docs (huggingface#1617) * Fix bug in diffusion config validation when not using image features * Fix DiffusionPolicy docstring about shape of env state

fix(180-degree rotation): Add cv2.ROTATE_180 to rotation checks in …

67196c9

…both OpenCV and RealSense camera implementations

Fix pi0 checkpoint state map (huggingface#1415)

71eff18

Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>

@fracapuano

fix colab typo (huggingface#1629)

945e1ff

Signed-off-by: Francesco Capuano <74058581+fracapuano@users.noreply.github.com>

fix(ci): declare entrypoints + fix testing release (huggingface#1642)

91ed609

feat(ci): release workflow publish to pypi test + lock files (hugging…

1baaa77

…face#1643) * chore(ci): add some release stuff * chore(ci): add requirements-macos * chore(ci): added lockfiles for future reference * feat(ci): add draft & prerelease option to release workflow tag

fix(ci): change steps based on wheter it is a -rc tag (huggingface#1646)

11525ce

fix(ci): change release-name to title (huggingface#1647)

dcb305f

fix(ci): use base tag for testpy to mimic the pyproject.toml version (h…

60dc8e3

…uggingface#1648)

chore(ci): Bump to v0.3.0 (huggingface#1649)

3e24eca

fix(ci): remove uv run + bump minor (huggingface#1651)

240a389

fix(ci): create venv for release testing (huggingface#1652)

f771e3e

chore: Bump to 4.0.0 (huggingface#1653)

8c57752

fix(docs): Update links in il_robots.mdx and il_sim.mdx to use absolu…

e0096fe

…te URLs (huggingface#1313) * Update links to use absolute URLs. * Update dataset upload example link to use HF_USER variable and match the correct syntax.

fix(typo): fixing typo in LeRobot authors names (huggingface#1673)

06bebd9

hil serl sim tutorial fixed and upgraded

99a0683

fix

b763d0e

spirosperos added 5 commits August 6, 2025 15:54

formatting

1017665

fix

b1e8a05

fix

bb01d8e

fix

c5b8238

Fix missing newlines at end of files

5303eb7

lukicdarkoo closed this Aug 8, 2025

HIL-SERL tutorial for simulation fixed and expanded #1
spirosperos wants to merge 35 commits into SpesRobotics:main from spirosperos:feature-hil-serl-sim-tutorial

14 participants