MVP-1: per-episode sensitivity recording + NPE analyzer #729
Draft
cvolk wants to merge 2 commits into
Opt-in writer (--factor_keys + --episode_summary) records the values of the listed arena_env_args keys plus per-episode outcomes (from registered task metrics) to a JSONL file during eval_runner. Existing behavior is unchanged when either flag is absent.

- Job.arena_env_args_dict preserves the original dict form alongside the existing CLI-args list, so the writer can look up factor values by name without re-parsing the args.
- The writer's import is deferred inside the per-job try block, matching the policy_runner.py:107 pattern for pxr-touching modules (the writer pulls in isaaclab_arena.metrics.metrics, which loads pxr at module top level).
- Hand-authored factors.yaml + jobs configs are checked in alongside; --factor_keys on the CLI must match the factors.yaml the analyzer consumes (the analyzer validates the pairing on load).

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
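To make the writer's job concrete, here is a minimal sketch of recording one JSONL line per episode. It is not the code from this PR: the function name `append_episode_record`, the record layout (`factors` / `outcomes`), the `embodiment` key, and the numeric values are all assumptions chosen for illustration.

```python
import json
import tempfile
from pathlib import Path


def append_episode_record(path, arena_env_args, factor_keys, outcomes):
    """Append one JSON line with the selected factor values and episode outcomes.

    Hypothetical helper: the name and record layout are illustrative only.
    """
    # Fail loudly if a requested factor key is not part of the env args dict.
    missing = [key for key in factor_keys if key not in arena_env_args]
    if missing:
        raise KeyError(f"factor keys not found in arena_env_args: {missing}")

    record = {
        # Look factor values up by name in the dict form of the args.
        "factors": {key: arena_env_args[key] for key in factor_keys},
        "outcomes": dict(outcomes),
    }
    # Append mode: one episode per line, earlier lines are left untouched.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


# Usage with made-up values for a light_intensity sweep.
with tempfile.TemporaryDirectory() as tmp:
    summary_path = Path(tmp) / "episode_summary.jsonl"
    env_args = {"light_intensity": 750.0, "embodiment": "droid"}
    append_episode_record(summary_path, env_args, ["light_intensity"], {"success": 1.0})
    append_episode_record(
        summary_path,
        {**env_args, "light_intensity": 1500.0},
        ["light_intensity"],
        {"success": 0.0},
    )
    lines = summary_path.read_text(encoding="utf-8").splitlines()
    records = [json.loads(line) for line in lines]
```

Looking values up in a dict, rather than re-parsing a CLI-args list, is the reason the commit keeps `arena_env_args_dict` alongside the list form.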
Reads paired factors.yaml + episode_summary.jsonl into the (theta, x, prior, factor_columns) quadruple that sbi consumes, trains NPE on a chosen outcome, and plots the 1D posterior marginal for a continuous factor. CLI driver at isaaclab_arena/scripts/analyze_sensitivity.py.

- MVP-1 scope: one continuous 1D factor; the categorical and vector (dim > 1) branches raise NotImplementedError so the extension point is reserved.
- A runtime [WARN] when fitting on a binary outcome surfaces sbi's 1D-Gaussian fallback caveat: the recovered peak reflects the empirical mean of successful theta values, not the true mode of the success curve.
- synthetic_data.py generates a paired JSONL + factors.yaml from a known competence band, letting the analyzer be smoke-tested end-to-end without sim.
- sbi added to DEV_DEPS so the docker dev install picks it up on rebuild.

Signed-off-by: Clemens Volk <cvolk@nvidia.com>
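The binary-outcome caveat can be illustrated without sbi or sim. The sketch below assumes a hypothetical, deliberately asymmetric success curve (the constants `TRUE_MODE` and `DECAY` and all function names are invented here, not taken from synthetic_data.py) and shows that the mean of the successful theta values can sit well away from the factor value where success is most likely.

```python
import math
import random
import statistics

# Hypothetical success curve: success is certain at TRUE_MODE, impossible
# below it, and decays slowly above it. These constants are illustrative.
TRUE_MODE = 0.2
DECAY = 0.3


def success_probability(theta):
    """Probability of a successful episode at factor value theta."""
    if theta < TRUE_MODE:
        return 0.0
    return math.exp(-(theta - TRUE_MODE) / DECAY)


def simulate_episodes(n, seed=0):
    """Draw theta uniformly from [0, 1] and sample a binary success outcome."""
    rng = random.Random(seed)
    episodes = []
    for _ in range(n):
        theta = rng.uniform(0.0, 1.0)
        success = 1 if rng.random() < success_probability(theta) else 0
        episodes.append((theta, success))
    return episodes


episodes = simulate_episodes(5000)
successful_thetas = [theta for theta, success in episodes if success == 1]

# A single Gaussian fitted to the successful thetas is centred on their mean,
# which the long upper tail pulls away from TRUE_MODE.
mean_success_theta = statistics.fmean(successful_thetas)

print(f"mode of the success curve: {TRUE_MODE}")
print(f"mean of successful theta:  {mean_success_theta:.3f}")
```

With a symmetric success curve the two quantities coincide, so the gap only matters when the curve is skewed or has a sharp edge on one side.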
Summary
Per-episode JSONL writer in eval_runner + offline NPE analyzer; MVP-1 of the sensitivity-analysis workstream described in [robolab Analysis Tooling MNPE].
Detailed description
(a) opt-in per-episode writer (`--factor_keys` + `--episode_summary`) in `eval_runner`; (b) `Job.arena_env_args_dict` preserves the original args dict for in-process lookups; (c) `isaaclab_arena.analysis.sensitivity` package with `episode_writer`, `dataset`, `analyzer`, `synthetic_data`; (d) `isaaclab_arena/scripts/analyze_sensitivity.py` CLI driver; (e) `sbi` added to DEV_DEPS so the docker image picks it up on rebuild; (f) hand-authored jobs configs + factors.yaml for the `light_intensity` MVP sweep on the droid + pi0 setup.
[…] `arena_env_args` + outcomes from registered task metrics). Analyzer side is entirely offline.
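As a rough picture of what the offline `dataset` step might do, the sketch below pairs an already-parsed factor spec with JSONL episode lines and returns a (theta, x, prior bounds, factor_columns) tuple. The spec schema (`type`, `dim`, `low`, `high`), the record layout, the function name `build_dataset`, and the numbers are assumptions for illustration; the real factors.yaml format is not shown in this PR description.

```python
import json


def build_dataset(factors_spec, jsonl_lines, outcome):
    """Build (theta, x, prior_bounds, factor_columns) from paired inputs.

    Hypothetical helper: schema and names are illustrative assumptions.
    """
    factor_columns = list(factors_spec)
    if len(factor_columns) != 1:
        raise NotImplementedError("this sketch handles exactly one factor")

    name = factor_columns[0]
    spec = factors_spec[name]
    # Only the continuous 1D case is handled; other branches are reserved.
    if spec.get("type") != "continuous" or spec.get("dim", 1) != 1:
        raise NotImplementedError(f"unsupported factor spec for {name!r}")

    theta, x = [], []
    for line_no, line in enumerate(jsonl_lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines in the JSONL
        record = json.loads(line)
        # Validate the pairing: recorded keys must match the factor spec.
        if sorted(record["factors"]) != sorted(factor_columns):
            raise ValueError(f"line {line_no}: factor keys do not match the spec")
        theta.append([float(record["factors"][name])])
        x.append([float(record["outcomes"][outcome])])

    prior_bounds = (float(spec["low"]), float(spec["high"]))
    return theta, x, prior_bounds, factor_columns


# Usage with made-up values.
factors_spec = {"light_intensity": {"type": "continuous", "low": 0.0, "high": 2000.0}}
sample_lines = [
    '{"factors": {"light_intensity": 750.0}, "outcomes": {"success": 1.0}}',
    '{"factors": {"light_intensity": 1500.0}, "outcomes": {"success": 0.0}}',
]
theta, x, prior_bounds, factor_columns = build_dataset(
    factors_spec, sample_lines, outcome="success"
)
```

The nested lists mirror the (num_episodes, dim) shape a density estimator would expect; converting them to tensors and building a uniform prior from `prior_bounds` is left out so the sketch stays dependency-free.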