*Custom hero artwork for IronEngine-RL, shown alongside the original reference image.*
A configurable robotics intelligence framework that connects AI models, robot platforms, safety boundaries, and evaluation workflows through standardized interfaces.
IronEngine-RL is the configurable brain layer between AI models and robot platforms. It connects perception, inference, repository context, decision making, safety boundaries, and actuation through standardized interfaces, so users can change models, transports, agents, and hardware without rewriting the full stack.
IronEngine-RL is designed for both research and engineering workflows: deterministic simulation, replay-assisted evaluation, mock hardware validation, real hardware integration, prompt-driven LLM backends, trainable PyTorch models, heuristic baselines, and plugin-loaded custom modules.
| Benefit | What you gain quickly |
|---|---|
| Lightweight runtime | Start with a lean core path instead of a heavyweight orchestration stack |
| Local-model friendly | Try practical local AI workflows on modest machines with smaller models and compact backends |
| Cross-platform structure | Carry the same framework approach across Windows, Linux, and macOS deployments |
| Scalable complexity | Add persistence, richer models, extra sensors, and HIL only when the project needs them |
Quick Links: Documentation Map · Software Environment Setup · Quick Start · Choose Your Starting Path · Runtime Modes at a Glance · ARMSmart and Custom Robot Integration · NiusRobotLab
Note
IronEngine-RL aims to make robotics stacks easier to change without making them easier to break. The framework is intentionally organized around contracts, validation, safety boundaries, and explicit module surfaces.
Brand assets included in this repository
- `assets/IE-RL-hero.svg` - primary README hero artwork
- `assets/IE-RL-banner.svg` - wide banner asset for docs, slides, or previews
- `assets/IE-RL-mark.svg` - compact emblem-style mark
- `assets/IE-RL.jpg` - original image kept as a reference asset
| Highlight | Why it matters | Where to look |
|---|---|---|
| Swappable intelligence layer | Swap heuristic, PyTorch, local LLM, cloud LLM, or custom providers without rewriting the full runtime | examples/inference/ and docs/api-reference.md |
| Explicit hardware contract | Keep action channels, feedback fields, camera roles, safety, and protocol design visible in profiles | examples/hardware/ and docs/custom-robots-and-sensors.md |
| Lightweight by default | Start with a simple in-memory repository and add persistence only when needed, keeping the core runtime lean | examples/plugins/persistent_repository/profile.json |
| Low local-model barrier | Run practical local-model workflows with modest hardware by starting from compact prompt backends and lightweight control paths instead of assuming large GPU servers | examples/inference/armsmart_ollama/profile.json and docs/software-setup.md |
| Wide platform compatibility | Use the same framework patterns across Windows, Linux, and macOS, then adapt transports and hardware adapters per deployment | docs/software-setup.md and docs/custom-robots-and-sensors.md |
| Complete end-to-end examples | Study richer ARMSmart examples that combine tasks, metrics, repositories, action schemes, and custom inference | examples/inference/armsmart_*_complete/ |
| Developer-facing documentation | Use dedicated API and developer docs instead of relying only on source reading | docs/api-reference.md and docs/developer-guide.md |
- Safety stays outside the model: battery, collision, stale-observation, and motion limits remain enforceable even when inference is wrong
- Profiles are first-class: the main user-facing configuration lives in profiles, not hidden wiring scattered across the codebase
- Additive clarity over breaking renames: aliases such as `command`, `feedback`, `results`, and `action_scheme` make the framework easier to understand without breaking core types
- Examples teach the framework: runnable examples are treated as part of the documentation, not as throwaway demos
- Persistence is opt-in: repository and database workflows are available, but the core runtime remains lightweight
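The "safety stays outside the model" principle can be sketched as a thin layer that clamps or replaces actions no matter what inference produced. The field names and limits below are illustrative assumptions for this sketch, not the framework's actual `ActionCommand` or safety-policy API:

```python
from dataclasses import dataclass

# Illustrative types only; the real framework exposes its own
# ActionCommand and safety-policy surfaces.
@dataclass
class Action:
    drive_speed: float   # normalized [-1, 1]
    arm_lift: float      # normalized [0, 1]

@dataclass
class SafetyLimits:
    max_speed: float = 0.5
    min_battery: float = 0.15
    max_obs_age_s: float = 0.5

def enforce(action: Action, battery: float, obs_age_s: float,
            limits: SafetyLimits) -> Action:
    """Clamp or replace an action; runs even when inference is wrong."""
    if battery < limits.min_battery or obs_age_s > limits.max_obs_age_s:
        # Low battery or stale observations: replace with a safe stop.
        return Action(drive_speed=0.0, arm_lift=action.arm_lift)
    # Otherwise clamp motion into the declared limits.
    speed = max(-limits.max_speed, min(limits.max_speed, action.drive_speed))
    lift = max(0.0, min(1.0, action.arm_lift))
    return Action(drive_speed=speed, arm_lift=lift)
```

Because the clamp sits outside the model, swapping the inference backend never weakens the battery, collision, or stale-observation guarantees.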
IronEngine-RL is inspired by the framework interpretation of Figure 7.1 and by the dissertation:
Mo, X. (2022). Convolutional Neural Network in Pattern Recognition (Doctoral dissertation, University of Kansas).
The Figure 7.1 reference asset for this repository is stored in assets/figure-7-1-framework.pdf. This project takes inspiration from the sources above and turns that conceptual framework into a configurable software system for AI-guided robotics.
Figure 7.1 reference overview used as inspiration for the IronEngine-RL architecture.
| Area | What it provides |
|---|---|
| Runtime core | A configurable runtime between AI reasoning modules and robots |
| Contracts | Standardized observation, action, camera-role, safety, and repository-context interfaces |
| Validation | Schema and compatibility checks before strict execution |
| Extensibility | Pluggable inference providers, agents, metrics, safety policies, and update strategies |
| Deployment path | A clean path from mock bring-up to hardware-in-the-loop and real robot deployment |
| Reference platform | A complete ARMSmart integration plus generalized templates for other robots |
| Efficiency profile | A lightweight default stack that avoids forcing persistence layers, oversized orchestration, or heavy model serving for every use case |
| Practical local AI path | Support for compact local-model workflows so teams can prototype without high-end server hardware |
| Platform reach | Cross-platform project structure that can be adapted for Windows, Linux, and macOS environments |
- For research: compare heuristic, prompt-driven, and trainable policies within one consistent runtime
- For hardware integration: standardize sensors, action channels, timing rules, and safety boundaries before motion
- For iteration speed: move from simulation to replay, mock hardware, HIL, and real hardware without rewriting the whole control stack
- For extensibility: add custom plugins for inference, agents, metrics, safety policies, and update strategies
- For maintainability: keep robot-specific details in profiles, adapters, and plugins instead of scattering them across the project
- For lightweight deployment: keep the default runtime lean so smaller robots, edge PCs, and lab workstations are not forced into heavyweight infrastructure
- For accessible local AI: begin with smaller local-model setups and compact prompt workflows before deciding whether larger hardware is justified
- For wide compatibility: use one framework structure across multiple operating systems and deployment styles, then tailor adapters instead of forking the architecture
| Benefit | What this means in practice |
|---|---|
| Lightweight runtime | The framework can start with mock hardware, rule-based control, or small local-model integrations without requiring a large orchestration stack |
| Lower hardware requirements for local models | Users can evaluate local LLM or prompt-style workflows on comparatively modest machines by choosing smaller models and lightweight backends such as Ollama or LM Studio-compatible paths |
| Efficient customization path | Profiles, plugins, and scaffolded configs reduce the amount of custom glue code needed to integrate a new robot or AI backend |
| Wide platform compatibility | The framework organization is portable across common development environments, while transports and adapters absorb most platform-specific differences |
| Gradual scaling | Teams can begin with simple baselines, then add persistence, richer models, extra sensors, or hardware-in-the-loop only when the project actually needs them |
| Use case | What teams do with IronEngine-RL |
|---|---|
| Research labs | Compare multiple control or reasoning backends against the same observation and action contracts |
| Robot builders | Bring up new sensors, MCU firmware, and motion interfaces safely through mock and HIL stages |
| AI integration teams | Swap local LLMs, cloud APIs, and trainable policies without rewriting the robot-facing runtime |
| Education and prototyping | Demonstrate end-to-end AI-guided robotics workflows with examples that are easier to explain and modify |
| Audience | What you likely care about | Best first stop |
|---|---|---|
| Robotics researchers | comparing reasoning backends with a stable runtime contract | docs/framework-architecture.md |
| Embedded and robot integration engineers | transport, protocol, telemetry shape, and safety control | examples/hardware/armsmart/profile.mock.json |
| ML and LLM developers | custom providers, prompt-driven workflows, adaptive policies, and repository context | examples/inference/ and docs/api-reference.md |
| Framework extenders | plugin surfaces, testing, and contribution flow | docs/developer-guide.md |
| Path | Purpose |
|---|---|
| `src/ironengine_rl/` | Core framework implementation |
| `profiles/` | Reusable reference profiles for scaffolding, validation, tests, and canonical starting configurations |
| `examples/` | Runnable scenario-specific profiles that show complete hardware, inference, or plugin workflows |
| `user_modules/` | Custom and example plugin modules |
| `docs/` | Supporting documentation beyond this main README |
| `assets/` | Figure and reference assets |
| `tools/` | Optional helper scripts and developer utilities when the project needs repository-maintenance helpers |
Use `profiles/` when you want the framework's reusable baseline configurations. These profiles are intended to be stable reference inputs for validation, scaffolding, automated tests, and quick adaptation into your own robot or inference setup.
Use `examples/` when you want a more opinionated, runnable demonstration such as ARMSmart mock hardware, local/cloud LLM integration, custom PyTorch stacks, or repository plugins working together end to end.
A practical rule: start by reading or validating a file in `profiles/`, move to `examples/` when you want a complete scenario, then copy either into your own custom profile once the workflow is clear.
```
IronEngine-RL/
├─ assets/
├─ docs/
├─ examples/
├─ logs/
├─ profiles/
├─ src/ironengine_rl/
├─ tests/
├─ user_modules/
├─ .gitignore
├─ pyproject.toml
├─ README.md
└─ requirements.txt
```
The tools/ folder is optional and is not present in the current tree. Add it only when you intentionally need helper scripts such as migration tools, release automation, converters, or local developer utilities.
IronEngine-RL is closely related to the broader IronEngine project, but it is intentionally narrower in scope.
IronEngine is described as a next-generation universal AI assistant that connects local and cloud agents to real-life operations across common operating systems. IronEngine-RL takes a simplified core idea from that project—the configurable connection between AI reasoning, real-world interfaces, and operational control—and optimizes it for reinforcement-learning-oriented robotics workflows.
In practice, this means IronEngine-RL keeps the parts that matter most for robot learning and control loops:
- standardized observation, action, safety, and repository interfaces
- explicit profile-driven configuration instead of hidden orchestration
- modular inference backends such as heuristic, PyTorch, local LLM, cloud LLM, and custom plugins
- validation, task evaluation, and boundary enforcement around every run
- a practical path from simulation to mock hardware, HIL, and real robot deployment
At the same time, IronEngine-RL is deliberately lighter than the broader IronEngine vision. It does not try to be a general-purpose universal assistant runtime. Instead, it specializes the architecture around robotics, task execution, policy iteration, reward-driven analysis, and repeatable experiment workflows.
- `IronEngine` - broader universal AI assistant direction
- `IronEngine-RL` - reinforcement-learning and robotics-focused specialization built around a simplified core framework idea
- shared spirit - modular AI integration, practical deployment, and local/cloud flexibility
- different optimization target - `IronEngine-RL` is tuned for task-oriented robot control, evaluation, safety, and experimentation
So if IronEngine represents the larger vision, IronEngine-RL represents a focused engineering branch optimized for reinforcement learning, robot task execution, and configurable embodied AI experiments.
| Document | Purpose |
|---|---|
| `docs/index.md` | Documentation entry point |
| `docs/software-setup.md` | Python environment setup, dependency roles, external tools, and bring-up notes |
| `docs/troubleshooting.md` | Common setup, validation, plugin, LLM, PyTorch, and hardware bring-up issues |
| `docs/profiles-and-configuration.md` | How profiles are structured and how to edit them effectively |
| `docs/anomaly-detection-and-safety.md` | How anomaly signals flow through inference, safety, and customization paths |
| `docs/examples-matrix.md` | Feature comparison across the main example profiles |
| `docs/api-reference.md` | Public runtime APIs, datamodels, extension ports, and CLI surfaces |
| `docs/developer-guide.md` | Detailed guidance for framework developers, extenders, and maintainers |
| `docs/repository-layout.md` | Repository structure and the purpose of `tools/` |
| `docs/framework-architecture.md` | Architecture, framework philosophy, and design surfaces |
| `docs/figure-7-1-mapping.md` | Direct mapping from Figure 7.1 concepts to framework modules, aliases, and examples |
| `docs/customization.md` | Customization patterns for modules, contracts, ARMSmart, and scaffolding |
| `docs/custom-robots-and-sensors.md` | Practical requirements for customized robots, sensors, interfaces, and MCUs |
| `docs/examples-and-workflows.md` | Example catalog and recommended user path |
| `docs/plugins-and-extensions.md` | Plugin organization and extension points |
| `docs/logging-and-outputs.md` | How runtime outputs should be organized |
| `docs/references.md` | Dissertation citation and figure reference |
Suggested reading paths

- I want the big picture: `README.md` → `docs/framework-architecture.md` → `docs/figure-7-1-mapping.md`
- I want to build or extend the framework: `docs/api-reference.md` → `docs/developer-guide.md` → `docs/plugins-and-extensions.md`
- I want to understand profiles first: `docs/profiles-and-configuration.md` → `docs/examples-and-workflows.md` → `profiles/`
- I want to run examples quickly: `README.md` → `docs/examples-and-workflows.md` → `docs/examples-matrix.md` → `examples/`
- I want to integrate hardware: `docs/custom-robots-and-sensors.md` → `examples/hardware/` → `docs/logging-and-outputs.md`
- I want to customize anomaly handling: `docs/anomaly-detection-and-safety.md` → `examples/plugins/anomaly_customization/profile.json`
- I am stuck on setup or validation: `docs/troubleshooting.md` → `docs/software-setup.md` → `docs/developer-guide.md`
| Backend type | Style | Best use | Requires weights | Requires external service |
|---|---|---|---|---|
| `rule_based` | Heuristic baseline | Deterministic reference behavior and simple validation | No | No |
| `linear_policy` | Lightweight trainable weights | Small framework-managed control policies | Yes | No |
| `pytorch_trainable` | Trainable PyTorch model | Custom learning workflows and trainable controllers | Yes | No |
| `ollama_prompt` | Prompt-driven local model | Local reasoning with an Ollama-served model | No | Yes, local |
| `lmstudio_prompt` | Prompt-driven local model | Local reasoning with LM Studio | No | Yes, local |
| `cloud_prompt` | Prompt-driven cloud model | Hosted LLM reasoning through a cloud API | No | Yes, remote |
| `custom_plugin` | User-defined custom provider | Specialized inference logic or custom model wrappers | Depends | Depends |
- Python `>=3.10`
- `pip` or Conda for package management
- a Windows, Linux, or macOS environment able to access your robot interface
- optional GPU support if you plan to use larger PyTorch models
For a fuller environment checklist and bring-up notes, see docs/software-setup.md.
Option A: venv example

```powershell
python -m venv .venv\IronEngine-RL
.\.venv\IronEngine-RL\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
```

Option B: Conda example

```powershell
conda create -n IronEngine-RL python=3.11 -y
conda activate IronEngine-RL
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
```

Optional Python dependencies:

- `pyserial` for serial-connected MCUs or motor controllers
- `opencv-python` for USB or onboard camera capture
- `torch` for custom PyTorch providers and trainable models
- `requests` for local LLM servers and cloud API integrations

External tools and drivers:

- camera drivers for your selected devices
- serial drivers for USB-to-UART or USB-CAN interfaces
- Ollama or another local model runtime if using local prompt backends
- cloud API credentials in environment variables if using hosted inference
1. Create and activate either the `venv` or Conda environment shown above.
2. Install dependencies from `requirements.txt`.
3. Validate the mock ARMSmart profile.
4. Run a short mock example episode.
5. Inspect the framework description or run the test suite if needed.
```powershell
python -m ironengine_rl.validate --profile examples\hardware\armsmart\profile.mock.json --strict
python -m ironengine_rl.cli --profile examples\hardware\armsmart\profile.mock.json --episodes 1 --steps 12
python -m ironengine_rl.describe --profile profiles\framework_customizable\profile.json
python -m unittest discover -s tests -p "test_*.py" -v
```

Tip: keep the active environment selected before running `python -m ...` commands so the correct interpreter and installed dependencies are used.
Tip
If you are brand new to the framework, start with mock hardware first, then inspect the manifest and validation output before trying any custom provider or hardware integration.
What you get from the first four commands
- `validate` checks schema and compatibility before risky execution
- `cli` runs a short episode so you can verify logs and runtime flow
- `describe` prints framework, platform, and compatibility information for inspection
- `unittest` confirms the repository examples and framework behaviors still pass regression checks
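To make the role of `validate` concrete, here is a tiny offline sketch of the kind of pre-flight gate it provides. The required section names are assumptions for illustration, not the real profile schema:

```python
import json

# Sketch of a pre-flight profile check. The REQUIRED_SECTIONS names are
# illustrative assumptions; the real schema is enforced by
# `python -m ironengine_rl.validate`.
REQUIRED_SECTIONS = ("hardware", "inference", "safety")

def sanity_check(profile_text: str) -> list:
    """Return a list of problems; an empty list means the profile looks plausible."""
    try:
        profile = json.loads(profile_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    return [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in profile]
```

Running a structural gate like this before any episode is why validation is treated as the safest first step in every workflow below.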
| Path | Best if you want to... | Start here |
|---|---|---|
| Mock bring-up | verify contracts, safety, and telemetry flow without touching hardware | `examples/hardware/armsmart/profile.mock.json` |
| Custom hardware onboarding | adapt a new robot, MCU, sensors, and transport interface in a controlled sequence | `examples/hardware/custom_robots/template.profile.json` |
| LLM-guided robotics | test local or hosted prompt backends on top of a stable hardware contract | `examples/inference/armsmart_ollama/profile.json` or `examples/inference/armsmart_cloud_api/profile.json` |
| Complete ARMSmart local LLM stack | try a richer local-LLM workflow with repository context, action-scheme metadata, and a custom task | `examples/inference/armsmart_ollama_complete/profile.json` |
| Complete ARMSmart cloud LLM stack | try a richer cloud-LLM workflow with repository context, action-scheme metadata, and a custom task | `examples/inference/armsmart_cloud_complete/profile.json` |
| Custom trainable model | plug in a PyTorch-backed reasoning module or your own provider | `examples/inference/armsmart_pytorch_custom/profile.json` |
| Complete ARMSmart PyTorch stack | use a custom provider, custom update rules, custom task, custom metric, custom agent, and persistent repository together | `examples/inference/armsmart_pytorch_complete/profile.json` |
| Anomaly customization | experiment with custom anomaly labels and safety routing before hardware bring-up | `examples/plugins/anomaly_customization/profile.json` |
| Persistent repository and action scheme | keep the runtime lightweight by default, but try an opt-in repository plugin and explicit action-scheme metadata | `examples/plugins/persistent_repository/profile.json` |
| Scaffold from scratch | generate a new profile with an explicit `action_scheme` automatically included | `python -m ironengine_rl.scaffold --output profiles\my_robot\profile.json --guided-goal custom_hardware --name my_robot --guided-backend udp --overwrite` |
These quick paths are meant to reduce setup friction: stabilize one layer first, then introduce the next variable.
| If your main question is... | Use this first | Why |
|---|---|---|
| "Does my robot contract even make sense?" | `python -m ironengine_rl.validate --profile ... --strict` | validation is the safest first gate |
| "Can I test without moving real hardware?" | `examples/hardware/armsmart/profile.mock.json` | mock transport gives safe telemetry and control-loop coverage |
| "How do I integrate my own model?" | `examples/inference/armsmart_pytorch_custom/profile.json` or `examples/inference/armsmart_pytorch_complete/profile.json` | these show custom provider patterns from simple to full-stack |
| "How do I use local or cloud LLMs?" | `examples/inference/armsmart_ollama_complete/profile.json` or `examples/inference/armsmart_cloud_complete/profile.json` | they show prompt-context composition using repository and action-scheme metadata |
| "How do I customize anomaly handling?" | `docs/anomaly-detection-and-safety.md` and `examples/plugins/anomaly_customization/profile.json` | they show how custom anomaly labels can drive warning-only or stop behavior |
| "Where do I start as a contributor?" | `docs/developer-guide.md` | it explains source layout, extension points, and test expectations |
```mermaid
flowchart LR
    A[Robot or Simulator] --> B[Observations and Camera Features]
    B --> C[Inference Provider]
    C --> D[Repository Context and Update Strategy]
    D --> E[Agent Action Selection]
    E --> F[Safety and Boundary Conditions]
    F --> G[Transport and Actuation]
    G --> H[Telemetry and Evaluation Metrics]
    H --> D
```
This flow shows the core idea behind IronEngine-RL: observations are interpreted by a swappable inference layer, combined with repository context, converted into actions by an agent, constrained by safety boundaries, and then executed through a hardware or simulation transport while evaluation feedback closes the loop.
Read the flow as a practical engineering loop
- sensors and cameras provide structured observations
- a provider turns them into task phase, state estimate, and reward hints
- the repository contributes context, memory, and optional persistence
- the agent chooses commands under the active `action_scheme`
- the safety layer clamps or replaces unsafe actions before they reach hardware
- evaluation and logs make the next iteration easier to debug and compare
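The loop above can be sketched in a few lines of plain Python. Everything here is illustrative (the stub functions and field names are assumptions, not framework APIs), but the ordering mirrors how the layers run:

```python
# Illustrative loop only; the stubs stand in for the framework's real modules.
def observe():                        # sensors and cameras
    return {"target_distance": 0.3, "battery": 0.9, "age_s": 0.05}

def infer(obs):                       # swappable inference provider
    return {"phase": "approach" if obs["target_distance"] > 0.05 else "grasp"}

def choose_action(estimate, context):  # agent under the active action scheme
    return {"drive_speed": 0.8 if estimate["phase"] == "approach" else 0.0}

def enforce_safety(action, obs):      # boundary layer: clamp before actuation
    if obs["battery"] < 0.15 or obs["age_s"] > 0.5:
        return {"drive_speed": 0.0}   # stale or low-battery: safe stop
    action["drive_speed"] = max(-0.5, min(0.5, action["drive_speed"]))
    return action

context = []                          # repository: memory across steps
for _ in range(3):
    obs = observe()
    estimate = infer(obs)
    action = enforce_safety(choose_action(estimate, context), obs)
    context.append((obs, action))     # telemetry and evaluation close the loop
```

Note that inference can be swapped out entirely without touching `enforce_safety`; that separation is the design point.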
| Mode | Best use | Hardware required | Safety focus | Typical starting point |
|---|---|---|---|---|
| Deterministic simulation | Logic and reward iteration | No | Boundary logic and task flow | Simulation profiles in profiles/ |
| Replay-assisted simulation | Regression checks with recorded observations | No | Contract and perception consistency | Replay-enabled simulation profiles |
| Mock hardware | Transport and telemetry validation | No real robot | Command mapping and telemetry shape | examples/hardware/armsmart/profile.mock.json |
| HIL | Device integration before full deployment | Partial or full hardware | Transport timing, sensors, cameras | examples/hardware/armsmart/profile.hil.json |
| Real hardware | End-to-end deployment | Yes | Full runtime safety, stale observation handling, battery and collision limits | A validated HIL or custom hardware profile |
Important
The recommended progression is still: validation → mock → HIL → real hardware. Even when the model is exciting, the safest workflow is still the fastest overall workflow.
| Workflow | Starting point | Purpose |
|---|---|---|
| ARMSmart mock validation | `examples/hardware/armsmart/profile.mock.json` | Safest first validation path for transport, safety, and telemetry |
| ARMSmart HIL | `examples/hardware/armsmart/profile.hil.json` | Hardware-in-the-loop ARMSmart setup |
| New robot onboarding | `examples/hardware/custom_robots/template.profile.json` | Grouped-hardware template for a new robot |
| Local LLM backend | `examples/inference/armsmart_ollama/profile.json` | Local Ollama reasoning |
| Complete local LLM backend | `examples/inference/armsmart_ollama_complete/profile.json` | Local LLM planning with repository/database context and explicit action-scheme notes |
| Hosted API backend | `examples/inference/armsmart_cloud_api/profile.json` | Hosted API reasoning |
| Complete cloud LLM backend | `examples/inference/armsmart_cloud_complete/profile.json` | Cloud LLM planning with repository/database context and explicit action-scheme notes |
| Custom model backend | `examples/inference/armsmart_pytorch_custom/profile.json` | Custom PyTorch reasoning |
| Complete PyTorch backend | `examples/inference/armsmart_pytorch_complete/profile.json` | Custom PyTorch provider plus custom policy/weight update rules, task, metric, and repository |
| Example family | What it adds beyond the basic examples |
|---|---|
| `armsmart_pytorch_complete` | custom adaptive provider, reward-aware update rules, custom task, custom metric, action-scheme-aware agent, and persistent repository/database traces |
| `armsmart_ollama_complete` | richer local LLM prompt context using repository state, database metadata, and explicit action-scheme notes |
| `armsmart_cloud_complete` | richer cloud LLM prompt context using repository state, success summaries, and explicit action-scheme notes |
| Transport or interface | Best use | Strengths | Design notes |
|---|---|---|---|
| `mock` | Early validation and regression-friendly bring-up | Fast, safe, reproducible, and easy to inspect in logs | Use this first when changing contracts, agents, or safety rules |
| `serial` | MCU-connected robots, motor controllers, and embedded bring-up | Simple wiring, broad hardware support, common for UART-based controllers | Define baud rate, packet framing, retries, and disconnect behavior clearly |
| `udp` | Networked controllers, external compute nodes, or low-latency LAN links | Flexible deployment and easy separation of compute from hardware | Plan for packet loss, heartbeat rules, and explicit timeout handling |
| CAN gateway | Multi-device actuator buses exposed through a bridge | Good fit for structured motor networks and robust field devices | Usually integrated through an adapter layer rather than directly in the agent logic |
The exact transport can vary by robot, but the framework expectation stays the same: commands and telemetry must match the declared contracts, and safety rules must remain enforceable even when the link becomes stale or noisy.
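One common pattern behind that expectation is a stale-link watchdog. The sketch below is an assumption-level illustration (the class and method names are not the framework's transport API); the idea is that the safety layer can issue a safe stop whenever `link_ok()` goes false:

```python
import time

# Illustrative watchdog sketch; timeout value and method names are
# assumptions, not the framework's real transport interface.
class LinkWatchdog:
    def __init__(self, timeout_s: float = 0.5, now=time.monotonic):
        self.timeout_s = timeout_s
        self.now = now                      # injectable clock for testing
        self.last_telemetry = self.now()

    def on_telemetry(self):
        """Call whenever a valid telemetry frame arrives."""
        self.last_telemetry = self.now()

    def link_ok(self) -> bool:
        """False once no telemetry has arrived within the timeout window."""
        return (self.now() - self.last_telemetry) <= self.timeout_s
```

Injecting the clock keeps the timeout logic deterministic in tests, which matches the framework's emphasis on validating behavior before hardware is involved.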
1. Validate `examples/hardware/armsmart/profile.mock.json`.
2. Inspect the generated output under `logs/examples/hardware/armsmart/mock`.
3. Use `examples/hardware/armsmart/diagnose_mock.ps1` for a quick PowerShell check.
4. Move to `examples/hardware/armsmart/profile.hil.json` only after the mock path is stable.
5. Swap inference backends later through `examples/inference/` without changing the hardware contract first.
6. Use the `*_complete` examples when you want end-to-end customization of action scheme, repository/database storage, evaluation task, metrics, and inference behavior.
The ARMSmart example does not ask users to handcraft raw servo packets in normal operation. Instead, the framework controls the arm through normalized action channels and then translates them into protocol packets for the configured transport.
| Layer | File | What happens |
|---|---|---|
| Action selection | `user_modules/examples/agents/armsmart_action_scheme_agent.py` | The agent fills `ActionCommand` fields such as `arm_lift`, `arm_extend`, `wrist_yaw`, and `gripper_close` |
| Action model | `src/ironengine_rl/interfaces/models.py` | `ActionCommand` defines the logical control surface used by agents, safety, and adapters |
| Packet encoding | `src/ironengine_rl/hardware_adapters/armsmart.py` | `encode_action_packets()` converts the logical command into `motor_set` and `servo_set` protocol frames |
| Protocol execution | `examples/hardware/armsmart/profile.mock.json` | The profile defines the protocol command IDs such as `servo_set`, `servo_enable`, `servo_freeze`, and `servo_estop` |
- `servo_id 1` → `arm_lift`
- `servo_id 2` → `arm_extend`
- `servo_id 3` → `wrist_yaw`
- `servo_id 4` → `gripper_close`
In src/ironengine_rl/hardware_adapters/armsmart.py, _servo_targets() converts each normalized action value into an angle and encode_action_packets() emits one servo_set frame per servo.
- an agent or custom policy produces an `ActionCommand`
- the hardware adapter maps those fields to servo IDs and target angles
- the selected transport sends the encoded packets
- telemetry comes back as observation fields such as `arm_height`, `arm_extension`, and `gripper_close`
If you want direct manual control, the clean way is still to set ActionCommand.arm_lift, ActionCommand.arm_extend, ActionCommand.wrist_yaw, and ActionCommand.gripper_close, then let the ARMSmart adapter perform the packet encoding for you.
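Conceptually, the adapter's job looks like the sketch below. The servo-ID mapping follows the table above, but the angle range and frame layout are illustrative assumptions, not the real ARMSmart protocol or the actual `_servo_targets()` implementation:

```python
# Conceptual sketch of normalized-action-to-servo encoding. Servo IDs follow
# the README's mapping; the 0-180 degree range and frame dict layout are
# assumptions for illustration only.
SERVO_MAP = {"arm_lift": 1, "arm_extend": 2, "wrist_yaw": 3, "gripper_close": 4}

def servo_targets(action: dict, max_angle: float = 180.0) -> dict:
    """Map normalized [0, 1] action fields to servo angles in degrees."""
    return {
        servo_id: max(0.0, min(1.0, action.get(field, 0.0))) * max_angle
        for field, servo_id in SERVO_MAP.items()
    }

def encode_action_packets(action: dict) -> list:
    """Emit one servo_set frame per servo, mirroring the adapter's behavior."""
    return [
        {"cmd": "servo_set", "servo_id": sid, "angle": round(angle, 1)}
        for sid, angle in servo_targets(action).items()
    ]
```

The value of this split is that agents and safety rules reason about named, normalized channels while protocol details stay inside the adapter.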
A customized robot should provide the following at minimum:
| Requirement | What it means in practice |
|---|---|
| Sensors | Telemetry needed by the chosen task, such as battery health, collision risk, arm position, gripper state, target offsets, and optional camera-derived features |
| Robot interface | A stable communication path such as serial, UDP, CAN gateway, or another controller interface that can send commands and return telemetry predictably |
| MCU or controller behavior | Safe startup defaults, repeatable command parsing, telemetry publishing, disconnect handling, and ideally an emergency-stop or passive-stop mode |
| Observation contract | Named sensor fields and camera roles that match the active platform, inference, and evaluation contracts |
| Action contract | Command channels for chassis, arm, wrist, gripper, or other actuators that the agent and safety layers can reason about |
| Timing assumptions | A known update rate, transport timeout, and stale-observation handling rule |
| Workflow discipline | Start with mock transport, validate compatibility, add realistic telemetry, then move to HIL and finally to real hardware |
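The observation-contract row above can be checked mechanically before any motion. The sketch below is illustrative: the declared field names are assumptions, and `ironengine_rl.validate` performs a far fuller version of this idea against the real schema:

```python
# Hypothetical contract check: compare incoming telemetry against the fields
# a profile declares. Field names here are illustrative assumptions.
DECLARED_OBSERVATIONS = {"battery", "collision_risk", "arm_height", "gripper_close"}

def contract_violations(telemetry: dict) -> dict:
    """Report declared fields missing from telemetry and unexpected extras."""
    received = set(telemetry)
    return {
        "missing": DECLARED_OBSERVATIONS - received,
        "unexpected": received - DECLARED_OBSERVATIONS,
    }
```

Catching a mismatch like this at the mock stage is much cheaper than discovering it during hardware bring-up.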
1. Start from `examples/hardware/custom_robots/template.profile.json`.
2. Define the robot contract in `hardware.platform.capabilities`.
3. Configure transport in `hardware.connection`, protocol IDs in `hardware.protocol`, sensors and cameras in `hardware.cameras`, and limits in `hardware.safety`.
4. If needed, adapt your MCU firmware or controller software so telemetry and commands match the declared contract.
5. Add custom plugins in `user_modules/` only when the built-in modules are not enough.
6. Run `ironengine_rl.validate` before any strict or hardware-facing execution.
7. If you are using the complete PyTorch example, optionally generate a demo weights file with `python examples\inference\armsmart_pytorch_complete\generate_demo_weights.py` before running the profile.
Why this workflow matters
The fastest way to make robotics integration unsafe is to combine new firmware, new hardware, and new AI behavior in one step. IronEngine-RL is structured so you can validate the contract first, then the transport, then the safety layer, and only afterward the reasoning backend.
Suggested next reads after this README
- read `docs/api-reference.md` if you want the public runtime symbols and extension surfaces
- read `docs/developer-guide.md` if you want testing and contribution guidance
- read `docs/customization.md` if you want to design your own profiles or plugins
- read `docs/examples-and-workflows.md` if you want the shortest path to a runnable example
This project is licensed under the PolyForm Noncommercial License 1.0.0.
See the full text in LICENSE.
See the redistribution notice in NOTICE.
Required Notice: Copyright 2026 DunknowCoding
This license allows noncommercial use, modification, and distribution under the license terms, but it does not allow commercial use.
If you need commercial-use rights, contact DunknowCoding to arrange separate licensing terms for this repository.
If you are interested in robotics experiments, AI-guided hardware integration, and future IronEngine-RL demonstrations, please check out the NiusRobotLab YouTube channel.
If you enjoy the project and want to follow future work, please subscribe to NiusRobotLab.
If you want the most complete customized-model path, start from `examples/inference/armsmart_pytorch_complete/profile.json` and the companion walkthrough `examples/inference/armsmart_pytorch_complete/grasp_process.md`.
```shell
python -m pip install -r requirements.txt
python -m pip install torch
python examples\inference\armsmart_pytorch_complete\generate_demo_weights.py
python -m ironengine_rl.cli --profile examples\inference\armsmart_pytorch_complete\profile.json --validate-only --strict
python tools\run_armsmart_pytorch_grasp_trial.py
```

- the provider can load `weights_file` when `torch` is available
- the custom update strategy still changes the effective policy state even with a live PyTorch model
- the agent shapes actions under `armsmart_pick_place_schedule`
- the repository writes `state_trace`, `reward_trace`, and `policy_trace`
- the grasp process can be inspected step by step through `approach`, `pregrasp`, and `grasp_or_lift`
- if `torch` or the weights file is missing, the complete example still runs by falling back to an analytic policy
- generate or place weights before comparing learned behavior across runs
- validate the profile before switching from mock or simulation paths to hardware-facing paths
- inspect the run directory after each trial, especially `summary.json`, `armsmart_experiment_db.json`, and `grasp_trial_report.json`
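To act on that last point quickly, a small standalone helper can list what each trial actually wrote. This is a hypothetical script, not part of IronEngine-RL: only the artifact filenames come from this README, and their contents depend on your run.

```python
import json
from pathlib import Path

# Artifact names taken from this README; the fields inside each file
# depend on the specific trial and profile that produced them.
ARTIFACTS = ["summary.json", "armsmart_experiment_db.json", "grasp_trial_report.json"]

def inspect_run_dir(run_dir):
    """Return {filename: sorted top-level keys} for each artifact that exists."""
    found = {}
    for name in ARTIFACTS:
        path = Path(run_dir) / name
        if path.is_file():
            data = json.loads(path.read_text())
            # For dict payloads report the top-level keys; otherwise the type name.
            found[name] = sorted(data) if isinstance(data, dict) else type(data).__name__
    return found
```

Point it at a run directory after each trial to see which artifacts were produced and which top-level fields they carry.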
Use the basic custom-provider example when you only need a plugin pattern, and use the complete example when you want verified interaction between provider, update strategy, task, agent, and repository.
When you use an LLM-backed provider, set the mission in the profile through `llm.task`. The framework then loads `SOUL.md` as the role contract for every prompt.
```json
{
  "llm": {
    "role_contract_file": "SOUL.md",
    "task": {
      "name": "right_object_grasp",
      "goal": "Grasp the right object on the work surface and avoid non-target objects.",
      "success_criteria": [
        "keep the correct target selected during approach",
        "enter pregrasp only when the target is aligned and reachable",
        "finish the grasp without violating safety limits"
      ],
      "constraints": [
        "respect the action scheme",
        "do not bypass the safety controller"
      ],
      "output_requirements": [
        "return a framework-compatible control phase",
        "stay grounded in visible detections and repository context"
      ]
    }
  }
}
```

- `llm.role_contract_file` - usually `SOUL.md`, the role contract prepended to each prompt
- `llm.task.name` - short stable identifier for the mission
- `llm.task.goal` - the main user objective in plain language
- `llm.task.success_criteria` - concrete completion checks the model should keep in mind
- `llm.task.constraints` - hard behavior limits such as safety and action-scheme compliance
- `llm.task.output_requirements` - optional instructions about the expected answer format or decision style
- `action_scheme` - the allowed control surface and schedule notes the provider should follow
- `evaluation.task` - the framework task used for evaluation and metrics, separate from the LLM mission text
Use the top-level `llm` block by default. Only set provider-local overrides when one backend needs a different role contract or a different mission than the rest of the profile.
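A small pre-flight check along these lines can confirm that a profile's `llm.task` block carries the fields described above before a run. This is a hypothetical standalone helper, not an IronEngine-RL API: the key names come from this README, while the required/optional split is an assumption.

```python
# Assumed split: name and goal are essential, the list-valued fields optional.
REQUIRED = ("name", "goal")
OPTIONAL = ("success_criteria", "constraints", "output_requirements")

def check_llm_task(profile: dict) -> list:
    """Return a list of problems found in profile['llm']['task']."""
    problems = []
    task = profile.get("llm", {}).get("task")
    if not isinstance(task, dict):
        return ["llm.task is missing or not an object"]
    for key in REQUIRED:
        if not task.get(key):
            problems.append(f"llm.task.{key} is missing or empty")
    for key in OPTIONAL:
        value = task.get(key)
        if value is not None and not isinstance(value, list):
            problems.append(f"llm.task.{key} should be a list of strings")
    return problems
```

An empty return value means the mission block at least has the expected shape; it says nothing about whether the wording itself is good.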
```json
{
  "model_provider": {
    "type": "ollama_prompt",
    "role_contract_file": "SOUL.md",
    "task": {
      "name": "cloth_fold_and_place",
      "goal": "Fold the cloth neatly and place it into the tray.",
      "constraints": [
        "respect the active action scheme",
        "do not exceed safety limits"
      ]
    }
  }
}
```

- `SOUL.md` defines the LLM role inside IronEngine-RL
- `llm.task` provides the user mission
- the action scheme, repository context, and current observation are added
- the provider asks for the next framework-compatible control phase
- put user intent in `llm.task`
- keep runtime scoring in `evaluation.task`
- keep control limits in `action_scheme` and `safety`
- use provider overrides only when one provider genuinely needs different task wording
Use `docs/llm-task-and-soul.md` for the dedicated workflow and `docs/profiles-and-configuration.md` for profile editing guidance.
```json
{
  "llm": {
    "role_contract_file": "SOUL.md",
    "task": {
      "name": "multi_object_target_grasp",
      "goal": "Pick up the red mug on the right side of the table and ignore the blue box and green bottle.",
      "success_criteria": [
        "keep the red mug selected as the target during approach",
        "do not switch to distractor objects when detections fluctuate",
        "enter pregrasp only when the target is aligned and reachable",
        "finish the grasp without violating safety limits"
      ],
      "constraints": [
        "respect the action scheme",
        "do not bypass the safety controller",
        "use only visible detections and repository context for target selection"
      ],
      "output_requirements": [
        "return the next framework-compatible control phase",
        "state why the selected target is still the correct object if ambiguity exists"
      ]
    }
  },
  "evaluation": {
    "task": "tabletop_grasp"
  }
}
```

```json
{
  "llm": {
    "role_contract_file": "SOUL.md",
    "task": {
      "name": "cloth_fold_and_place",
      "goal": "Fold the cloth neatly and place it into the right tray.",
      "success_criteria": [
        "align the cloth before folding",
        "complete the fold sequence without unsafe arm motion",
        "place the folded cloth into the correct tray"
      ],
      "constraints": [
        "respect the action scheme",
        "keep motion within safety and reach limits"
      ],
      "output_requirements": [
        "return the next task phase in the fold workflow",
        "prefer stable staged actions over abrupt motion changes"
      ]
    }
  },
  "evaluation": {
    "task": "tabletop_grasp"
  }
}
```

```json
{
  "llm": {
    "role_contract_file": "SOUL.md",
    "task": {
      "name": "inspection_route_followup",
      "goal": "Inspect checkpoint A, then B, then C, and report any anomaly before continuing.",
      "success_criteria": [
        "visit checkpoints in order",
        "pause and report if an anomaly is detected",
        "complete the route without violating navigation or safety limits"
      ],
      "constraints": [
        "respect the active action scheme",
        "do not skip anomaly handling",
        "do not continue to the next checkpoint if safety requires a stop"
      ],
      "output_requirements": [
        "return the next framework-compatible control phase",
        "keep the checkpoint state consistent with repository memory"
      ]
    }
  }
}
```

Use these as starting points, then adapt `goal`, `success_criteria`, and `constraints` to your robot, sensors, and task phases.
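One way to reuse these starting points is to merge a mission block into an existing base profile without editing it by hand. The sketch below is a hypothetical standalone helper, not a framework API:

```python
import copy

def with_mission(base_profile: dict, mission: dict) -> dict:
    """Return a copy of base_profile whose llm.task is replaced by mission."""
    profile = copy.deepcopy(base_profile)  # leave the original untouched
    profile.setdefault("llm", {})["task"] = mission
    return profile
```

Because the base profile is deep-copied, you can derive several mission variants from one shared configuration and compare their runs side by side.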
The fragment below shows a practical task-oriented profile slice for a multi-object grasping workflow. It puts the user mission, evaluation label, action scheme, safety limits, and provider backend in one place.
```json
{
  "model_provider": {
    "type": "ollama_prompt",
    "model": "qwen3.5:2b",
    "base_url": "http://127.0.0.1:11434",
    "timeout_s": 20.0
  },
  "llm": {
    "role_contract_file": "SOUL.md",
    "task": {
      "name": "multi_object_target_grasp",
      "goal": "Pick up the red mug on the right side of the table and ignore the blue box and green bottle.",
      "success_criteria": [
        "keep the red mug selected during approach",
        "enter pregrasp only when the target is aligned and reachable",
        "finish the grasp without violating safety limits"
      ],
      "constraints": [
        "respect the action scheme",
        "do not bypass the safety controller",
        "use visible detections and repository context for target selection"
      ],
      "output_requirements": [
        "return the next framework-compatible control phase",
        "ground the decision in current detections"
      ]
    }
  },
  "evaluation": {
    "task": "tabletop_grasp",
    "metrics": ["task_performance", "boundary_violations"]
  },
  "action_scheme": {
    "name": "target_first_grasp_schedule",
    "schedule_notes": [
      "approach before aggressive arm extension",
      "prefer target stability over fast target switching",
      "only enter grasp when alignment and safety are acceptable"
    ]
  },
  "safety": {
    "collision_stop_threshold": 0.8,
    "low_battery_stop_threshold": 0.15,
    "stale_observation_stop_steps": 3
  },
  "repository": {
    "type": "knowledge_repository"
  }
}
```

Start from this shape, then replace the task wording, model backend, action-scheme notes, and safety thresholds for your own robot and workspace.
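Before pointing a profile like this at hardware, it can be worth sanity-checking the `safety` block. The sketch below is a hypothetical standalone check, not a framework API: the key names come from the fragment above, and the suggested ranges are assumptions you should tune to your platform.

```python
def check_safety_block(safety: dict) -> list:
    """Return a list of problems with the safety thresholds (assumed ranges)."""
    problems = []
    # Collision and battery thresholds are treated here as normalized fractions.
    if not 0.0 < safety.get("collision_stop_threshold", -1.0) <= 1.0:
        problems.append("collision_stop_threshold should be in (0, 1]")
    if not 0.0 < safety.get("low_battery_stop_threshold", -1.0) < 1.0:
        problems.append("low_battery_stop_threshold should be in (0, 1)")
    # At least one stale step must be tolerated before a forced stop.
    if safety.get("stale_observation_stop_steps", 0) < 1:
        problems.append("stale_observation_stop_steps should be at least 1")
    return problems
```

A check like this complements, but does not replace, running `ironengine_rl.validate` on the full profile.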
A ready-to-run task-oriented profile now lives at `examples/inference/task_oriented_multi_object_grasp/profile.json`.
Use `tools/run_task_oriented_multi_object_grasp_trial.py` when you want a short deterministic simulation run that exercises the task settings without requiring live local-model responses.
The framework already covers the core runtime loop, but the next high-value work should stay explicitly task-oriented so users can map research ideas to runnable robotics workflows.
For the structured version of this list, see `docs/roadmap.md`.
- add more reusable task blueprints for grasping, sorting, insertion, stacking, docking, inspection, and recovery workflows
- add task templates that separate `goal`, `success_criteria`, `constraints`, `phase_gates`, and `failure_recovery`
- expand `SOUL.md` and `llm.task` examples for multi-stage missions, multi-object manipulation, and human-in-the-loop supervision
- add reference task packs for ARMSmart, mobile manipulators, and sensor-rich field robots
- add clearer examples that compare LLM-guided task decomposition against custom PyTorch policy execution on the same task
- add more end-to-end tests for complete example profiles, especially custom-model, anomaly-routing, and repository-backed workflows
- add regression tests for `SOUL.md` loading, `llm.task` propagation, and prompt composition edge cases
- add deterministic tests for policy-phase transitions such as `approach`, `pregrasp`, `grasp`, `lift`, and `place`
- add repository assertions for long-run `state_trace`, `reward_trace`, `policy_trace`, and update-log consistency
- add stricter validation coverage for profile path resolution, optional weights files, and custom-plugin contract mismatches
- optimize local-model control loops for shorter prompts, lower latency, and better fallback behavior on small models
- optimize custom PyTorch examples for clearer online adaptation signals and cheaper verification runs
- add profiling helpers for inference latency, transport timing, and repository write overhead
- add caching and batching strategies where they improve repeatability without hiding control decisions
- reduce setup friction for repeated example runs by improving weight generation, reusable run presets, and environment diagnostics
- add a lightweight run visualizer for `summary.json`, `transitions.jsonl`, and repository database files
- add policy-phase timeline views for `approach`, `pregrasp`, `grasp`, `grasp_or_lift`, and safety overrides
- add reward-component plots for progress, alignment, visibility, safety, and success over time
- add camera and detection overlays for replay-based debugging and target-selection analysis
- add comparison dashboards for multiple runs so users can inspect model variants and update-strategy changes side by side
- add richer simulation presets for cluttered scenes, distractor-heavy grasping, partial occlusion, and recovery scenarios
- add more deterministic simulation harnesses for documentation-grade verification of task-oriented workflows
- add replay-assisted simulation tools that combine saved observations, camera frames, and repository state for debugging
- add fault-injection presets for communication drops, battery degradation, sensor drift, and camera failures
- add simulation tools for curriculum-style task progression from simple tabletop grasping to multi-stage manipulation
- add a first-class experiment runner for repeated profile sweeps, ablations, and benchmark summaries
- add better developer tooling around custom modules, including template generators and contract-aware scaffolds
- add optional web-based inspection tools for logs, traces, and simulation outputs
- add stronger documentation for migration paths from mock validation to HIL and real hardware
- add more polished example bundles that include setup, deterministic runner scripts, expected outputs, and troubleshooting notes in one place
This TODO list is intentionally practical: each item should eventually map to a runnable profile, a testable workflow, or an inspectable output artifact.

