Pull requests: huggingface/trl

Pull requests list

Fix nested vocab_size for DistillationTrainer and GOLDTrainer
#5592 opened Apr 19, 2026 by Beichen-Ma
2 of 8 tasks
feat: add TPO trainer
#5591 opened Apr 18, 2026 by JeanKaddour Draft
4 of 8 tasks
Add tiny Qwen3-4B-Instruct-2507
#5586 opened Apr 17, 2026 by qgallouedec Member
[docs] Add chat templates page to web docs
#5581 opened Apr 17, 2026 by sergiopaniego Member
8 tasks
Update AsyncGRPO example with GSM8K and tested hyperparameters
#5580 opened Apr 17, 2026 by sergiopaniego Member
8 tasks
Chunked Cross-Entropy
#5575 opened Apr 17, 2026 by qgallouedec Member Draft
Add training chat template for Qwen3-2507
#5574 opened Apr 16, 2026 by SwayamInSync Contributor
refactor: self distillation trainers (sdpo/sdft/...)
#5573 opened Apr 16, 2026 by LeonEricsson Collaborator
2 of 8 tasks
Improve BrowserGym examples for latest OpenEnv version
#5568 opened Apr 16, 2026 by sergiopaniego Member
8 tasks
Set _tokenizer attribute in experimental trainers
#5566 opened Apr 16, 2026 by albertvillanova Member
DataCollatorForPreference checking 'margin' in all examples
#5564 opened Apr 15, 2026 by antoinsader
5 of 8 tasks
Revert VLM support in parse_response
#5561 opened Apr 15, 2026 by qgallouedec Member
Accept processor in get_training_chat_template
#5560 opened Apr 15, 2026 by qgallouedec Member
Check prefix preservation at the token level
#5559 opened Apr 15, 2026 by qgallouedec Member
Move experimental example scripts into their trainer folders
#5556 opened Apr 15, 2026 by sergiopaniego Member
1 of 8 tasks
Add support for prompt-completion format in DistillationTrainer
#5555 opened Apr 15, 2026 by cmpatino Collaborator
3 of 6 tasks
Fix GRPO VLM tests: Multimodal training requires conversational prompts
#5550 opened Apr 15, 2026 by kaixuanliu Contributor
3 tasks done
Drop vLLM 0.11 support
#5549 opened Apr 14, 2026 by qgallouedec Member
Differentiate Phi-3 and Phi-3.5 in tests
#5546 opened Apr 14, 2026 by qgallouedec Member
fix: Pass AsyncGRPOTrainer's processing_class to AsyncRolloutWorker
#5538 opened Apr 14, 2026 by xuanduy04 Contributor
2 of 8 tasks
feat: add Phi-3 training chat template with generation markers
#5526 opened Apr 12, 2026 by RudrenduPaul Contributor
2 of 4 tasks
feat: add Gemma/Gemma2 training chat templates with generation markers
#5523 opened Apr 11, 2026 by ps-abhi
5 of 8 tasks
feat(glm-4-moe): Add {% generation %} markers for training chat template
#5519 opened Apr 10, 2026 by casinca Contributor
5 of 8 tasks