Refactor: Metrics in IsaacLabArenaManagerBasedRLEnvCfg by alexmillane · Pull Request #733 · isaac-sim/IsaacLab-Arena

alexmillane · 2026-05-28T13:43:50Z

Summary

Refactors how metrics enter our IsaacLabArenaManagerBasedRLEnvCfg to match Isaac Lab convention.

Detailed description

We were attaching metrics runtime objects to the IsaacLabArenaManagerBasedRLEnvCfg
Isaac Lab expects only configclasses in *Cfg objects.
This changes our design to match more closely with Isaac Lab.
- Metrics now enter the env through MetricsCfgs
- Run-time metrics are then built at env build time by IsaacLabArenaManagerBasedRLEnv
- This matches how things are done in IsaacLab - Cfgs are spcified by the user, and then run-time objects are built by the system based on those configs.

isaaclab-review-bot

Code Review: Metrics Refactor to Isaac Lab Conventions

Summary

This PR refactors the metrics system to align with Isaac Lab conventions, moving from runtime objects attached to the config to a configclass-based approach with a dedicated MetricsManager. This is a well-structured architectural improvement.

✅ What Works Well

Consistent Isaac Lab Pattern: The new MetricsManager mirrors how RewardManager/TerminationManager work - parsing a configclass container and iterating fields. This improves maintainability.
Clean Separation: Extracting module-level compute_* functions from the metric classes enables better testability and follows functional programming principles.
Proper Type Annotations: MetricTermCfg is well-documented with clear docstrings explaining the expected function signature.
Test Updates: Tests are updated to use the new env.unwrapped.compute_metrics() API consistently.

⚠️ Issues & Suggestions

1. Incomplete `compute_metrics` Method (Medium Priority)

File: isaaclab_arena/environments/isaaclab_arena_manager_based_env.py (L113-119)

The compute_metrics() method docstring says it returns metrics, but the method body appears truncated in the diff. Ensure the implementation calls self.metrics_manager.compute():

def compute_metrics(self) -> dict[str, Any]:
    """Compute all registered metrics from the recorded HDF5 dataset."""
    return self.metrics_manager.compute()

2. Mutable Default Argument (Low Priority)

File: isaaclab_arena/metrics/metric_term_cfg.py (L33)

params: dict[str, Any] = dict()

Using dict() as a default is fine in dataclasses with @configclass, but double-check that Isaac Lab's configclass decorator handles this correctly to avoid shared mutable state.

3. Debug Print Statement (Low Priority)

File: isaaclab_arena/environments/isaaclab_arena_manager_based_env.py (L106)

print("[INFO] Metrics Manager: ", self.metrics_manager)

Consider using a proper logger or removing this before merge. Debug prints can clutter production logs.

4. Type Annotation for `metrics` Field (Low Priority)

File: isaaclab_arena/environments/isaaclab_arena_manager_based_env.py (L70)

metrics: object | None = None

Consider a more specific type hint. Even a type alias like MetricsCfg = object with documentation would improve clarity. Alternatively, define a protocol or use Any explicitly.

5. Missing `repr` for MetricsManager (Low Priority)

The print statement in load_managers will show <MetricsManager object at 0x...>. Consider adding a __repr__ method to show active terms:

def __repr__(self) -> str:
    return f"MetricsManager(terms={self.active_terms})"

🔍 Edge Cases to Consider

Empty Metrics Container: If cfg.metrics is an empty configclass (no fields), _prepare_terms will correctly produce an empty manager - this is handled well.
None Term Values: The check if term_cfg is None: continue handles optional metrics gracefully.
Dataset Not Found: If get_metric_recorder_dataset_path() returns a path that doesn't exist, get_recorded_metric_data() will fail. This is pre-existing behavior but worth noting.

📊 Test Coverage

✅ test_revolute_joint_moved_rate_metric.py updated
✅ test_success_rate_metric.py updated
⚠️ No direct unit tests for MetricsManager class itself
⚠️ No test for _metrics_to_metrics_cfg in ArenaEnvBuilder

Consider adding:

Unit test for MetricsManager with mock configclass
Test for edge case where metrics=None (NoTask scenario)

🔧 Unrelated Change

File: isaaclab_arena/examples/compile_env_notebook.py (L34)

-cracker_box.set_initial_pose(Pose(position_xyz=(0.4, 0.0, 0.1), rotation_wxyz=(1.0, 0.0, 0.0, 0.0)))
+cracker_box.set_initial_pose(Pose(position_xyz=(0.4, 0.0, 0.1), rotation_xyzw=(0.0, 0.0, 0.0, 1.0)))

This changes rotation_wxyz to rotation_xyzw - a quaternion convention change. Is this intentional as part of this PR, or should it be a separate commit? Both represent identity rotation, but mixing concerns in one PR can make git bisect harder.

Verdict

Approve with minor suggestions. The architectural change is sound and aligns well with Isaac Lab conventions. The minor issues noted above are non-blocking but should be addressed before or shortly after merge.

Great work on improving the codebase consistency! 🎉

Update (5a89f0c): Reviewed incremental changes from 1d98931 to 5a89f0c.

Changes Reviewed

Debug print removed ✅ - The print("[INFO] Metrics Manager: ...") statement noted in my original review has been removed.
Module split - Configs (IsaacLabArenaManagerBasedRLEnvCfg, IsaacArenaManagerBasedMimicEnvCfg, ArenaPhysicsCfg) moved to new file isaaclab_arena_manager_based_env_cfg.py. This is good separation of concerns - runtime env stays lean.
API rename: func → compute_metric_func - More descriptive naming in MetricTermCfg. All metric classes updated accordingly (success_rate.py, object_moved.py, revolute_joint_moved_rate.py, sequential_task_base.py).
Removed compute_metric_from_recording abstract method - The now-redundant abstract method in MetricBase and its implementations have been removed. The computation is handled via compute_metric_func in the term config.
Simplified MetricsManager._prepare_terms - Removed dict handling branch (configclass only) and explicit type check. Cleaner code, but ⚠️ minor consideration: if a non-MetricTermCfg field exists on the configclass, it will silently fail when accessed. The original type check provided a clearer error message.
Removed unused import - import numpy as np removed from metric_base.py.
Docstrings condensed - Various docstrings trimmed for brevity.

Minor Note

The removal of the explicit MetricTermCfg type check in _prepare_terms means invalid config types will now fail later (e.g., AttributeError: 'str' object has no attribute 'recorder_term_name') rather than with a descriptive TypeError. This is a minor ergonomic tradeoff for cleaner code - acceptable given that configs are typically generated programmatically by _metrics_to_metrics_cfg.

Verdict

All changes look good. The refactoring addressed the debug print issue and further improves the codebase structure. Approve stands. ✅

Update (c723fe0): Reviewed new file isaaclab_arena_manager_based_env_cfg.py. This is the config file mentioned in the previous update - now properly committed. Contains ArenaPhysicsCfg (PhysX/Newton presets), IsaacLabArenaManagerBasedRLEnvCfg, and IsaacArenaManagerBasedMimicEnvCfg. Clean implementation with proper docstrings and appropriate defaults. No issues found. ✅

alexmillane

Self review.

alexmillane · 2026-05-28T13:44:57Z

+        """Build a configclass container with one ``MetricTermCfg`` field per metric.
+
+        Mirrors how Isaac Lab shapes ``cfg.rewards`` / ``cfg.terminations`` so that
+        the runtime :class:`MetricsManager` can iterate the container's fields the
+        same way ``RewardManager`` / ``TerminationManager`` do.
+        """


shorten to single line comment.

alexmillane · 2026-05-28T13:46:10Z

+    # Metrics: a configclass container with one ``MetricTermCfg`` field per metric (or ``None``).
+    # Mirrors how ``rewards`` / ``terminations`` are shaped on ``ManagerBasedRLEnvCfg``.


Remove comment.

alexmillane · 2026-05-28T13:46:53Z

+    """Arena run-time env that adds an offline :class:`MetricsManager` to the base class.
+
+    Mirrors how Isaac Lab attaches a :class:`isaaclab.managers.RewardManager` /
+    :class:`isaaclab.managers.TerminationManager` in :meth:`load_managers`. The metrics
+    manager itself is lightweight and only runs after rollout (see
+    :class:`isaaclab_arena.metrics.metrics_manager.MetricsManager`).
+    """


Shorten comment.

Should say "Arena extension to ManagerBasedRLEnv that adds metrics.

alexmillane · 2026-05-28T13:47:11Z

+        print("[INFO] Metrics Manager: ", self.metrics_manager)
+
+    def compute_metrics(self) -> dict[str, Any]:
+        """Compute all registered metrics from the recorded HDF5 dataset.


Remove "from the recorded HDF5 dataset"

alexmillane · 2026-05-28T13:47:15Z

+    def load_managers(self) -> None:
+        super().load_managers()
+        self.metrics_manager = MetricsManager(self.cfg.metrics, self)
+        print("[INFO] Metrics Manager: ", self.metrics_manager)


alexmillane · 2026-05-28T13:58:15Z

+            if not isinstance(term_cfg, MetricTermCfg):
+                raise TypeError(
+                    f"Configuration for the metric term '{term_name}' is not of type MetricTermCfg."
+                    f" Received: '{type(term_cfg)}'."
+                )


We know this is the case, so we can leave this check out.

alexmillane · 2026-05-28T13:59:36Z

+        """Names of the metric terms registered on this manager."""
+        return list(self._term_names)
+
+    def compute(self, dataset_path: pathlib.Path | None = None) -> dict[str, Any]:


Remove the argument for dataset_path: pathlib.Path | None = None and only rely on the recorder manager.

alexmillane · 2026-05-28T14:00:36Z

-        object_moved_rate = np.mean(object_moved_per_demo)
-        return object_moved_rate
+    def compute_metric_from_recording(self, recorded_metric_data: list[np.ndarray]) -> float:
+        return compute_object_moved_rate(recorded_metric_data, self.object_velocity_threshold)


See comment in the base class, let's get rid of this function: compute_object_moved_rate

alexmillane · 2026-05-28T14:00:50Z

-            revolute_joint_moved_per_demo.append(moved_open or moved_closed)
-        revolute_joint_moved_rate = np.mean(revolute_joint_moved_per_demo)
-        return revolute_joint_moved_rate
+    def compute_metric_from_recording(self, recorded_metric_data: list[np.ndarray]) -> float:


See comment in the base class, let's get rid of this function: compute_metric_from_recording

alexmillane · 2026-05-28T14:02:25Z

+        )

-        return subtask_success_rates.tolist()
+    def compute_metric_from_recording(self, recorded_metric_data: list[np.ndarray]) -> list:


See comment in base class, we can get rid of: compute_metric_from_recording

alexmillane

More

alexmillane · 2026-05-28T14:45:43Z

+        The returned config carries a pointer to the module-level offline compute
+        function (``compute_metric_func``) together with the parameters needed to call
+        it. Used by the runtime ``MetricsManager`` to compute the metric value once a
+        rollout has finished.


Change to:

"The returned config carries a pointer to the metric computation
function together with the parameters needed to call
it. Used by the runtime MetricsManager to compute the metric value once a
rollout has finished."

alexmillane · 2026-05-28T14:47:12Z

+    Mirrors the Isaac Lab manager pattern (:class:`isaaclab.managers.RewardManager`,
+    :class:`isaaclab.managers.TerminationManager`, ...) but is intentionally lightweight:
+    metric computation runs *after* a rollout has finished, so there is no need for
+    per-step hooks or the full :class:`isaaclab.managers.ManagerBase` machinery.


remove:

"Mirrors the Isaac Lab manager pattern (:class:isaaclab.managers.RewardManager,
:class:isaaclab.managers.TerminationManager, ...) but is intentionally lightweight:
metric computation runs after a rollout has finished, so there is no need for
per-step hooks or the full :class:isaaclab.managers.ManagerBase machinery."

greptile-apps · 2026-05-28T14:57:20Z

Greptile Summary

Refactors metrics from runtime objects stored directly on IsaacLabArenaManagerBasedRLEnvCfg to Isaac Lab–style configclass containers, introducing MetricTermCfg, MetricsManager, and IsaacLabArenaManagerBasedRLEnv as the new default entry point.

Cfg / runtime split: the three cfg classes (ArenaPhysicsCfg, IsaacLabArenaManagerBasedRLEnvCfg, IsaacArenaManagerBasedMimicEnvCfg) move to a dedicated _cfg.py file; the _env.py file now holds only the new IsaacLabArenaManagerBasedRLEnv subclass, which wires MetricsManager in load_managers().
Metric compute functions extracted: each metric (SuccessRate, ObjectMovedRate, RevoluteJointMovedRate, SubtaskSuccessRate) now has a standalone module-level compute function; get_metric_term_cfg() replaces compute_metric_from_recording() on every MetricBase subclass, returning a MetricTermCfg that bundles the function pointer + params + recorder name.
Entry point change: _get_env_entry_point now returns IsaacLabArenaManagerBasedRLEnv instead of the base Isaac Lab ManagerBasedRLEnv, ensuring MetricsManager is always initialized via the standard load_managers() hook.

Confidence Score: 4/5

The refactoring is structurally sound and follows Isaac Lab conventions; all call sites are updated consistently and the new MetricsManager initializes cleanly when no metrics are configured.

The _prepare_terms method iterates over cfg.dict while the rest of the codebase uses dataclasses.fields() for configclass traversal — a pattern mismatch that would produce a confusing AttributeError at metric-compute time if any non-declared instance attribute ever appears. The policy_runner metric check also guards on the cfg but not on the presence of compute_metrics, leaving a narrow gap if an env without the method carries a metrics cfg. Neither issue affects the happy path for the current codebase, but both are worth addressing before the code drifts further.

isaaclab_arena/metrics/metrics_manager.py (_prepare_terms field iteration) and isaaclab_arena/evaluation/policy_runner.py (missing compute_metrics guard)

Important Files Changed

Filename	Overview
isaaclab_arena/environments/isaaclab_arena_manager_based_env.py	Slimmed down to only the new IsaacLabArenaManagerBasedRLEnv runtime class; CFG classes moved to _cfg.py. load_managers() hooks MetricsManager, compute_metrics() delegates to it.
isaaclab_arena/environments/isaaclab_arena_manager_based_env_cfg.py	New home for ArenaPhysicsCfg, IsaacLabArenaManagerBasedRLEnvCfg, and IsaacArenaManagerBasedMimicEnvCfg; metrics field typed as object
isaaclab_arena/metrics/metrics_manager.py	New MetricsManager; iterates fields via cfg.dict rather than dataclasses.fields(), which is inconsistent with the rest of the codebase and fragile if configclass ever adds non-declared instance attributes.
isaaclab_arena/metrics/metric_term_cfg.py	New MetricTermCfg configclass; params uses a bare dict() mutable default rather than dataclasses.field(default_factory=dict), relying on @configclass to wrap it.
isaaclab_arena/environments/arena_env_builder.py	Adds _metrics_to_metrics_cfg helper to convert MetricBase list to a dynamic MetricsCfg configclass; default entry point changed from ManagerBasedRLEnv to IsaacLabArenaManagerBasedRLEnv.
isaaclab_arena/evaluation/policy_runner.py	Replaces lazy-imported compute_metrics() free function with env.unwrapped.compute_metrics() method call; check on cfg.metrics presence is retained but no guard for missing compute_metrics method.

Sequence Diagram

sequenceDiagram
    participant Builder as ArenaEnvBuilder
    participant Env as IsaacLabArenaManagerBasedRLEnv
    participant MM as MetricsManager
    participant HDF5 as HDF5 Dataset

    Builder->>Builder: _metrics_to_metrics_cfg(metrics)
    Builder->>Env: "cfg.metrics = MetricsCfg instance"
    Env->>Env: load_managers()
    Env->>MM: MetricsManager(cfg.metrics, self)
    MM->>MM: _prepare_terms(cfg)

    Note over Env,HDF5: At eval / rollout end
    Env->>MM: compute()
    MM->>HDF5: get_metric_recorder_dataset_path
    loop for each MetricTermCfg
        MM->>HDF5: get_recorded_metric_data(recorder_term_name)
        HDF5-->>MM: list[np.ndarray]
        MM->>MM: "compute_metric_func(data, **params)"
    end
    MM->>HDF5: get_num_episodes()
    MM-->>Env: dict[str, Any]

_{Reviews (1): Last reviewed commit: "Moar cleanup." | Re-trigger Greptile}

greptile-apps · 2026-05-28T14:57:24Z

+    def _prepare_terms(self, cfg: object | None) -> None:
+        if cfg is None:
+            return
+        for term_name, term_cfg in cfg.__dict__.items():
+            self._term_names.append(term_name)
+            self._term_cfgs.append(term_cfg)


The rest of the codebase (e.g. combine_configclass_instances, transform_configclass_instance) iterates configclass fields using dataclasses.fields(), not instance.__dict__. Using __dict__ is fragile: any non-declared instance attribute added by a configclass __post_init__ hook or future subclass would be silently treated as a metric term, causing a confusing AttributeError at compute() time when recorder_term_name doesn't exist on the stray value.

Suggested change

def _prepare_terms(self, cfg: object | None) -> None:

if cfg is None:

return

for term_name, term_cfg in cfg.__dict__.items():

self._term_names.append(term_name)

self._term_cfgs.append(term_cfg)

def _prepare_terms(self, cfg: object | None) -> None:

if cfg is None:

return

import dataclasses

for field in dataclasses.fields(cfg):

self._term_names.append(field.name)

self._term_cfgs.append(getattr(cfg, field.name))

greptile-apps · 2026-05-28T14:57:25Z

+    params: dict[str, Any] = dict()
+    """The keyword arguments forwarded to :attr:`compute_metric_func`. Defaults to an empty dict."""


Using a bare dict() call as a field default is unconventional for dataclass-based configs. While Isaac Lab's @configclass wraps mutable defaults in field(default_factory=...) automatically, making this explicit is consistent with standard Python dataclass practice and avoids surprises if the class is ever used outside the configclass machinery.

Suggested change

params: dict[str, Any] = dict()

"""The keyword arguments forwarded to :attr:`compute_metric_func`. Defaults to an empty dict."""

params: dict[str, Any] = dataclasses.field(default_factory=dict)

"""The keyword arguments forwarded to :attr:`compute_metric_func`. Defaults to an empty dict."""

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

greptile-apps · 2026-05-28T14:57:26Z

                    if hasattr(env.unwrapped.cfg, "metrics") and env.unwrapped.cfg.metrics is not None:
-                        from isaaclab_arena.metrics.metrics import compute_metrics
-
-                        metrics = compute_metrics(env.unwrapped)
+                        metrics = env.unwrapped.compute_metrics()


The guard checks that cfg.metrics is non-None, but compute_metrics() is defined only on IsaacLabArenaManagerBasedRLEnv. If the unwrapped env is a plain ManagerBasedRLEnv carrying a metrics-bearing cfg (e.g. an env registered before this refactor), the call raises AttributeError. Adding a hasattr guard mirrors the defensive intent of the existing cfg check.

Suggested change

if hasattr(env.unwrapped.cfg, "metrics") and env.unwrapped.cfg.metrics is not None:

from isaaclab_arena.metrics.metrics import compute_metrics

metrics = compute_metrics(env.unwrapped)

metrics = env.unwrapped.compute_metrics()

if (

hasattr(env.unwrapped.cfg, "metrics")

and env.unwrapped.cfg.metrics is not None

and hasattr(env.unwrapped, "compute_metrics")

):

metrics = env.unwrapped.compute_metrics()

alexmillane added 2 commits May 28, 2026 11:59

WIP

e014d9c

Fix notebook

1d98931

isaaclab-review-bot Bot reviewed May 28, 2026

View reviewed changes

alexmillane commented May 28, 2026

View reviewed changes

alexmillane added 2 commits May 28, 2026 16:18

Self review.

5a89f0c

Add missing file.

c723fe0

alexmillane commented May 28, 2026

View reviewed changes

Moar cleanup.

c665e20

alexmillane marked this pull request as ready for review May 28, 2026 14:51

alexmillane requested review from cvolkcvolk, peterd-NV, qianl-nv, viiik-inside, xyao-nv and zhx06 as code owners May 28, 2026 14:51

greptile-apps Bot reviewed May 28, 2026

View reviewed changes

		# Metrics: a configclass container with one ``MetricTermCfg`` field per metric (or ``None``).
		# Mirrors how ``rewards`` / ``terminations`` are shaped on ``ManagerBasedRLEnvCfg``.

		params: dict[str, Any] = dict()
		"""The keyword arguments forwarded to :attr:`compute_metric_func`. Defaults to an empty dict."""

Conversation

alexmillane commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Detailed description

Uh oh!

isaaclab-review-bot Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Code Review: Metrics Refactor to Isaac Lab Conventions

Summary

✅ What Works Well

⚠️ Issues & Suggestions

1. Incomplete compute_metrics Method (Medium Priority)

2. Mutable Default Argument (Low Priority)

3. Debug Print Statement (Low Priority)

4. Type Annotation for metrics Field (Low Priority)

5. Missing __repr__ for MetricsManager (Low Priority)

🔍 Edge Cases to Consider

📊 Test Coverage

🔧 Unrelated Change

Verdict

Changes Reviewed

Minor Note

Verdict

Uh oh!

alexmillane left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alexmillane left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot commented May 28, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

alexmillane commented May 28, 2026 •

edited

Loading

isaaclab-review-bot Bot left a comment •

edited

Loading

1. Incomplete `compute_metrics` Method (Medium Priority)

4. Type Annotation for `metrics` Field (Low Priority)

5. Missing `repr` for MetricsManager (Low Priority)