Refactor: Metrics in IsaacLabArenaManagerBasedRLEnvCfg#733
Conversation
There was a problem hiding this comment.
Code Review: Metrics Refactor to Isaac Lab Conventions
Summary
This PR refactors the metrics system to align with Isaac Lab conventions, moving from runtime objects attached to the config to a configclass-based approach with a dedicated MetricsManager. This is a well-structured architectural improvement.
✅ What Works Well
-
Consistent Isaac Lab Pattern: The new
MetricsManagermirrors howRewardManager/TerminationManagerwork - parsing a configclass container and iterating fields. This improves maintainability. -
Clean Separation: Extracting module-level
compute_*functions from the metric classes enables better testability and follows functional programming principles. -
Proper Type Annotations:
MetricTermCfgis well-documented with clear docstrings explaining the expected function signature. -
Test Updates: Tests are updated to use the new
env.unwrapped.compute_metrics()API consistently.
⚠️ Issues & Suggestions
1. Incomplete compute_metrics Method (Medium Priority)
File: isaaclab_arena/environments/isaaclab_arena_manager_based_env.py (L113-119)
The compute_metrics() method docstring says it returns metrics, but the method body appears truncated in the diff. Ensure the implementation calls self.metrics_manager.compute():
def compute_metrics(self) -> dict[str, Any]:
"""Compute all registered metrics from the recorded HDF5 dataset."""
return self.metrics_manager.compute()2. Mutable Default Argument (Low Priority)
File: isaaclab_arena/metrics/metric_term_cfg.py (L33)
params: dict[str, Any] = dict()Using dict() as a default is fine in dataclasses with @configclass, but double-check that Isaac Lab's configclass decorator handles this correctly to avoid shared mutable state.
3. Debug Print Statement (Low Priority)
File: isaaclab_arena/environments/isaaclab_arena_manager_based_env.py (L106)
print("[INFO] Metrics Manager: ", self.metrics_manager)Consider using a proper logger or removing this before merge. Debug prints can clutter production logs.
4. Type Annotation for metrics Field (Low Priority)
File: isaaclab_arena/environments/isaaclab_arena_manager_based_env.py (L70)
metrics: object | None = NoneConsider a more specific type hint. Even a type alias like MetricsCfg = object with documentation would improve clarity. Alternatively, define a protocol or use Any explicitly.
5. Missing __repr__ for MetricsManager (Low Priority)
The print statement in load_managers will show <MetricsManager object at 0x...>. Consider adding a __repr__ method to show active terms:
def __repr__(self) -> str:
return f"MetricsManager(terms={self.active_terms})"🔍 Edge Cases to Consider
-
Empty Metrics Container: If
cfg.metricsis an empty configclass (no fields),_prepare_termswill correctly produce an empty manager - this is handled well. -
None Term Values: The check
if term_cfg is None: continuehandles optional metrics gracefully. -
Dataset Not Found: If
get_metric_recorder_dataset_path()returns a path that doesn't exist,get_recorded_metric_data()will fail. This is pre-existing behavior but worth noting.
📊 Test Coverage
- ✅
test_revolute_joint_moved_rate_metric.pyupdated - ✅
test_success_rate_metric.pyupdated ⚠️ No direct unit tests forMetricsManagerclass itself⚠️ No test for_metrics_to_metrics_cfginArenaEnvBuilder
Consider adding:
- Unit test for
MetricsManagerwith mock configclass - Test for edge case where
metrics=None(NoTask scenario)
🔧 Unrelated Change
File: isaaclab_arena/examples/compile_env_notebook.py (L34)
-cracker_box.set_initial_pose(Pose(position_xyz=(0.4, 0.0, 0.1), rotation_wxyz=(1.0, 0.0, 0.0, 0.0)))
+cracker_box.set_initial_pose(Pose(position_xyz=(0.4, 0.0, 0.1), rotation_xyzw=(0.0, 0.0, 0.0, 1.0)))This changes rotation_wxyz to rotation_xyzw - a quaternion convention change. Is this intentional as part of this PR, or should it be a separate commit? Both represent identity rotation, but mixing concerns in one PR can make git bisect harder.
Verdict
Approve with minor suggestions. The architectural change is sound and aligns well with Isaac Lab conventions. The minor issues noted above are non-blocking but should be addressed before or shortly after merge.
Great work on improving the codebase consistency! 🎉
Update (5a89f0c): Reviewed incremental changes from 1d98931 to 5a89f0c.
Changes Reviewed
-
Debug print removed ✅ - The
print("[INFO] Metrics Manager: ...")statement noted in my original review has been removed. -
Module split - Configs (
IsaacLabArenaManagerBasedRLEnvCfg,IsaacArenaManagerBasedMimicEnvCfg,ArenaPhysicsCfg) moved to new fileisaaclab_arena_manager_based_env_cfg.py. This is good separation of concerns - runtime env stays lean. -
API rename:
func→compute_metric_func- More descriptive naming inMetricTermCfg. All metric classes updated accordingly (success_rate.py,object_moved.py,revolute_joint_moved_rate.py,sequential_task_base.py). -
Removed
compute_metric_from_recordingabstract method - The now-redundant abstract method inMetricBaseand its implementations have been removed. The computation is handled viacompute_metric_funcin the term config. -
Simplified
MetricsManager._prepare_terms- Removed dict handling branch (configclass only) and explicit type check. Cleaner code, but⚠️ minor consideration: if a non-MetricTermCfg field exists on the configclass, it will silently fail when accessed. The original type check provided a clearer error message. -
Removed unused import -
import numpy as npremoved frommetric_base.py. -
Docstrings condensed - Various docstrings trimmed for brevity.
Minor Note
The removal of the explicit MetricTermCfg type check in _prepare_terms means invalid config types will now fail later (e.g., AttributeError: 'str' object has no attribute 'recorder_term_name') rather than with a descriptive TypeError. This is a minor ergonomic tradeoff for cleaner code - acceptable given that configs are typically generated programmatically by _metrics_to_metrics_cfg.
Verdict
All changes look good. The refactoring addressed the debug print issue and further improves the codebase structure. Approve stands. ✅
Update (c723fe0): Reviewed new file isaaclab_arena_manager_based_env_cfg.py. This is the config file mentioned in the previous update - now properly committed. Contains ArenaPhysicsCfg (PhysX/Newton presets), IsaacLabArenaManagerBasedRLEnvCfg, and IsaacArenaManagerBasedMimicEnvCfg. Clean implementation with proper docstrings and appropriate defaults. No issues found. ✅
alexmillane
left a comment
There was a problem hiding this comment.
Self review.
| """Build a configclass container with one ``MetricTermCfg`` field per metric. | ||
|
|
||
| Mirrors how Isaac Lab shapes ``cfg.rewards`` / ``cfg.terminations`` so that | ||
| the runtime :class:`MetricsManager` can iterate the container's fields the | ||
| same way ``RewardManager`` / ``TerminationManager`` do. | ||
| """ |
There was a problem hiding this comment.
shorten to single line comment.
| # Metrics: a configclass container with one ``MetricTermCfg`` field per metric (or ``None``). | ||
| # Mirrors how ``rewards`` / ``terminations`` are shaped on ``ManagerBasedRLEnvCfg``. |
There was a problem hiding this comment.
Remove comment.
| """Arena run-time env that adds an offline :class:`MetricsManager` to the base class. | ||
|
|
||
| Mirrors how Isaac Lab attaches a :class:`isaaclab.managers.RewardManager` / | ||
| :class:`isaaclab.managers.TerminationManager` in :meth:`load_managers`. The metrics | ||
| manager itself is lightweight and only runs after rollout (see | ||
| :class:`isaaclab_arena.metrics.metrics_manager.MetricsManager`). | ||
| """ |
There was a problem hiding this comment.
Shorten comment.
Should say "Arena extension to ManagerBasedRLEnv that adds metrics.
| print("[INFO] Metrics Manager: ", self.metrics_manager) | ||
|
|
||
| def compute_metrics(self) -> dict[str, Any]: | ||
| """Compute all registered metrics from the recorded HDF5 dataset. |
There was a problem hiding this comment.
Remove "from the recorded HDF5 dataset"
| def load_managers(self) -> None: | ||
| super().load_managers() | ||
| self.metrics_manager = MetricsManager(self.cfg.metrics, self) | ||
| print("[INFO] Metrics Manager: ", self.metrics_manager) |
| if not isinstance(term_cfg, MetricTermCfg): | ||
| raise TypeError( | ||
| f"Configuration for the metric term '{term_name}' is not of type MetricTermCfg." | ||
| f" Received: '{type(term_cfg)}'." | ||
| ) |
There was a problem hiding this comment.
We know this is the case, so we can leave this check out.
| """Names of the metric terms registered on this manager.""" | ||
| return list(self._term_names) | ||
|
|
||
| def compute(self, dataset_path: pathlib.Path | None = None) -> dict[str, Any]: |
There was a problem hiding this comment.
Remove the argument for dataset_path: pathlib.Path | None = None and only rely on the recorder manager.
| object_moved_rate = np.mean(object_moved_per_demo) | ||
| return object_moved_rate | ||
| def compute_metric_from_recording(self, recorded_metric_data: list[np.ndarray]) -> float: | ||
| return compute_object_moved_rate(recorded_metric_data, self.object_velocity_threshold) |
There was a problem hiding this comment.
See comment in the base class, let's get rid of this function: compute_object_moved_rate
| revolute_joint_moved_per_demo.append(moved_open or moved_closed) | ||
| revolute_joint_moved_rate = np.mean(revolute_joint_moved_per_demo) | ||
| return revolute_joint_moved_rate | ||
| def compute_metric_from_recording(self, recorded_metric_data: list[np.ndarray]) -> float: |
There was a problem hiding this comment.
See comment in the base class, let's get rid of this function: compute_metric_from_recording
| ) | ||
|
|
||
| return subtask_success_rates.tolist() | ||
| def compute_metric_from_recording(self, recorded_metric_data: list[np.ndarray]) -> list: |
There was a problem hiding this comment.
See comment in base class, we can get rid of: compute_metric_from_recording
| The returned config carries a pointer to the module-level offline compute | ||
| function (``compute_metric_func``) together with the parameters needed to call | ||
| it. Used by the runtime ``MetricsManager`` to compute the metric value once a | ||
| rollout has finished. |
There was a problem hiding this comment.
Change to:
"The returned config carries a pointer to the metric computation
function together with the parameters needed to call
it. Used by the runtime MetricsManager to compute the metric value once a
rollout has finished."
| Mirrors the Isaac Lab manager pattern (:class:`isaaclab.managers.RewardManager`, | ||
| :class:`isaaclab.managers.TerminationManager`, ...) but is intentionally lightweight: | ||
| metric computation runs *after* a rollout has finished, so there is no need for | ||
| per-step hooks or the full :class:`isaaclab.managers.ManagerBase` machinery. |
There was a problem hiding this comment.
remove:
"Mirrors the Isaac Lab manager pattern (:class:isaaclab.managers.RewardManager,
:class:isaaclab.managers.TerminationManager, ...) but is intentionally lightweight:
metric computation runs after a rollout has finished, so there is no need for
per-step hooks or the full :class:isaaclab.managers.ManagerBase machinery."
Greptile SummaryRefactors metrics from runtime objects stored directly on
Confidence Score: 4/5The refactoring is structurally sound and follows Isaac Lab conventions; all call sites are updated consistently and the new MetricsManager initializes cleanly when no metrics are configured. The _prepare_terms method iterates over cfg.dict while the rest of the codebase uses dataclasses.fields() for configclass traversal — a pattern mismatch that would produce a confusing AttributeError at metric-compute time if any non-declared instance attribute ever appears. The policy_runner metric check also guards on the cfg but not on the presence of compute_metrics, leaving a narrow gap if an env without the method carries a metrics cfg. Neither issue affects the happy path for the current codebase, but both are worth addressing before the code drifts further. isaaclab_arena/metrics/metrics_manager.py (_prepare_terms field iteration) and isaaclab_arena/evaluation/policy_runner.py (missing compute_metrics guard) Important Files Changed
Sequence DiagramsequenceDiagram
participant Builder as ArenaEnvBuilder
participant Env as IsaacLabArenaManagerBasedRLEnv
participant MM as MetricsManager
participant HDF5 as HDF5 Dataset
Builder->>Builder: _metrics_to_metrics_cfg(metrics)
Builder->>Env: "cfg.metrics = MetricsCfg instance"
Env->>Env: load_managers()
Env->>MM: MetricsManager(cfg.metrics, self)
MM->>MM: _prepare_terms(cfg)
Note over Env,HDF5: At eval / rollout end
Env->>MM: compute()
MM->>HDF5: get_metric_recorder_dataset_path
loop for each MetricTermCfg
MM->>HDF5: get_recorded_metric_data(recorder_term_name)
HDF5-->>MM: list[np.ndarray]
MM->>MM: "compute_metric_func(data, **params)"
end
MM->>HDF5: get_num_episodes()
MM-->>Env: dict[str, Any]
Reviews (1): Last reviewed commit: "Moar cleanup." | Re-trigger Greptile |
| def _prepare_terms(self, cfg: object | None) -> None: | ||
| if cfg is None: | ||
| return | ||
| for term_name, term_cfg in cfg.__dict__.items(): | ||
| self._term_names.append(term_name) | ||
| self._term_cfgs.append(term_cfg) |
There was a problem hiding this comment.
The rest of the codebase (e.g.
combine_configclass_instances, transform_configclass_instance) iterates configclass fields using dataclasses.fields(), not instance.__dict__. Using __dict__ is fragile: any non-declared instance attribute added by a configclass __post_init__ hook or future subclass would be silently treated as a metric term, causing a confusing AttributeError at compute() time when recorder_term_name doesn't exist on the stray value.
| def _prepare_terms(self, cfg: object | None) -> None: | |
| if cfg is None: | |
| return | |
| for term_name, term_cfg in cfg.__dict__.items(): | |
| self._term_names.append(term_name) | |
| self._term_cfgs.append(term_cfg) | |
| def _prepare_terms(self, cfg: object | None) -> None: | |
| if cfg is None: | |
| return | |
| import dataclasses | |
| for field in dataclasses.fields(cfg): | |
| self._term_names.append(field.name) | |
| self._term_cfgs.append(getattr(cfg, field.name)) |
| params: dict[str, Any] = dict() | ||
| """The keyword arguments forwarded to :attr:`compute_metric_func`. Defaults to an empty dict.""" |
There was a problem hiding this comment.
Using a bare
dict() call as a field default is unconventional for dataclass-based configs. While Isaac Lab's @configclass wraps mutable defaults in field(default_factory=...) automatically, making this explicit is consistent with standard Python dataclass practice and avoids surprises if the class is ever used outside the configclass machinery.
| params: dict[str, Any] = dict() | |
| """The keyword arguments forwarded to :attr:`compute_metric_func`. Defaults to an empty dict.""" | |
| params: dict[str, Any] = dataclasses.field(default_factory=dict) | |
| """The keyword arguments forwarded to :attr:`compute_metric_func`. Defaults to an empty dict.""" |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| if hasattr(env.unwrapped.cfg, "metrics") and env.unwrapped.cfg.metrics is not None: | ||
| from isaaclab_arena.metrics.metrics import compute_metrics | ||
|
|
||
| metrics = compute_metrics(env.unwrapped) | ||
| metrics = env.unwrapped.compute_metrics() |
There was a problem hiding this comment.
The guard checks that
cfg.metrics is non-None, but compute_metrics() is defined only on IsaacLabArenaManagerBasedRLEnv. If the unwrapped env is a plain ManagerBasedRLEnv carrying a metrics-bearing cfg (e.g. an env registered before this refactor), the call raises AttributeError. Adding a hasattr guard mirrors the defensive intent of the existing cfg check.
| if hasattr(env.unwrapped.cfg, "metrics") and env.unwrapped.cfg.metrics is not None: | |
| from isaaclab_arena.metrics.metrics import compute_metrics | |
| metrics = compute_metrics(env.unwrapped) | |
| metrics = env.unwrapped.compute_metrics() | |
| if ( | |
| hasattr(env.unwrapped.cfg, "metrics") | |
| and env.unwrapped.cfg.metrics is not None | |
| and hasattr(env.unwrapped, "compute_metrics") | |
| ): | |
| metrics = env.unwrapped.compute_metrics() |
Summary
Refactors how metrics enter our
IsaacLabArenaManagerBasedRLEnvCfgto match Isaac Lab convention.Detailed description
IsaacLabArenaManagerBasedRLEnvCfgconfigclasses in*Cfgobjects.MetricsCfgsIsaacLabArenaManagerBasedRLEnv