Skip to content

♻️ Refactor RL actions by SDK#680

Open
flowerthrower wants to merge 51 commits into
mainfrom
668-improve-action-pass-imports-and-wrappers
Open

♻️ Refactor RL actions by SDK#680
flowerthrower wants to merge 51 commits into
mainfrom
668-improve-action-pass-imports-and-wrappers

Conversation

@flowerthrower
Copy link
Copy Markdown
Collaborator

@flowerthrower flowerthrower commented May 12, 2026

Description

Restructures RL actions into SDK-specific modules for Qiskit, TKET, and BQSKit. Also moves the corresponding wrapper logic closer to each SDK action implementation and keeps compatibility exports for existing imports.

This immensely improves modularity and reduces SDK-dependent patches scattered across RL modules.
It also enables much cleaner and proper testing of pass invariants.

Fixes #668

Merging blocked by (should be merged first):

Assisted by: GPT5.5 via Codex

Checklist

  • The pull request only contains commits that are focused and relevant to this change.
  • I have added appropriate tests that cover the new/changed functionality.
  • I have updated the documentation to reflect these changes.
  • I have added entries to the changelog for any noteworthy additions, changes, fixes, or removals.
  • I have added migration instructions to the upgrade guide (if needed).
  • The changes follow the project's style guidelines and introduce no new warnings.
  • The changes are fully tested and pass the CI checks.
  • I have reviewed my own code changes.

If PR contains AI-assisted content:

  • I have disclosed the use of AI tools in the PR description as per our AI Usage Guidelines.
  • AI-assisted commits include an Assisted-by: [Model Name] via [Tool Name] footer.
  • I confirm that I have personally reviewed and understood all AI-generated content, and accept full responsibility for it.

flowerthrower and others added 17 commits March 11, 2026 12:24
## Description

This PR addresses critical bugs in the RL training process with the
following key changes:

**Structure Improvements:**
- **Redesigned action validation logic** (`predictorenv.py`): Rewrote
`determine_valid_actions_for_state()` with a more structured (but
equivalent) state machine that explicitly tracks three circuit states
(synthesized, laid_out, routed) and handles 6 different state
combinations.
- Added helper methods `is_circuit_laid_out()` and `is_circuit_routed()`
to replace the buggy `CheckMap` pass with more reliable state checking.
The new logic supports both the original restricted MDP and a flexible
general MDP mode.

- **Fixed type annotation** (`actions.py`): Corrected `do_while`
parameter type from `dict[str, Circuit]` to `PropertySet` and added
missing import for Qiskit's `PropertySet`.

- **Added reproducibility** (`predictor.py`): Set random seed for
non-test training runs to ensure reproducible results.

- **Improved VF2Layout error handling** (`predictorenv.py`): Replaced
assertion failures with warning logs when VF2Layout doesn't find a
solution, preventing crashes during training.

**Test Updates:**
- Suppressed deprecation warnings in tket routing test

---------

Signed-off-by: Patrick Hopf <81010725+flowerthrower@users.noreply.github.com>
Co-authored-by: flowerthrower <flowerthrower@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@flowerthrower flowerthrower linked an issue May 12, 2026 that may be closed by this pull request
@codecov
Copy link
Copy Markdown

codecov Bot commented May 12, 2026

Codecov Report

❌ Patch coverage is 94.68085% with 15 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/mqt/predictor/rl/predictorenv.py 83.7% 6 Missing ⚠️
src/mqt/predictor/rl/actions/qiskit_actions.py 93.1% 5 Missing ⚠️
src/mqt/predictor/rl/actions/bqskit_actions.py 93.1% 4 Missing ⚠️

📢 Thoughts on this report? Let us know!

@flowerthrower flowerthrower added the enhancement New feature or request label Jun 2, 2026
@flowerthrower
Copy link
Copy Markdown
Collaborator Author

@coderabbitai full review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 5, 2026

✅ Action performed

Full review finished.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 5, 2026

Review Change Stack

📝 Walkthrough

Summary by CodeRabbit

  • Refactor

    • Restructured RL compilation action system from monolithic file to modular SDK-specific package with separate handlers for Qiskit, TKET, and BQSKit.
    • Enhanced environment with centralized action dispatching and improved error handling for failed compilation steps.
  • Tests

    • Updated integration tests to verify action behavior across compilation stages.
  • Chores

    • Updated .gitignore to exclude RL training artifacts.

Walkthrough

This PR restructures the RL action registry from a flat module into a modular package with SDK-specific implementations. The old actions.py is replaced by an actions/ package where each compiler SDK (Qiskit, TKET, BQSKit) provides its own action factories and execution helpers. The environment is refactored to delegate action application through these SDK helpers instead of inline code.

Changes

RL Actions Architecture Refactor

Layer / File(s) Summary
Action contract and registry
src/mqt/predictor/rl/actions/__init__.py
Defines Action dataclass variants (DeviceIndependentAction, DeviceDependentAction), PassType and CompilationOrigin enums, and a global registry with register_action, remove_action, get_actions_by_pass_type functions. On module load, imports and registers all SDK action factories and adds a built-in terminate action.
Qiskit SDK action implementation
src/mqt/predictor/rl/actions/qiskit_actions.py
Implements optimization, layout, mapping, and synthesis actions as device-dependent and device-independent Action objects. Includes run_qiskit_action dispatcher, _postprocess_layout_action for VF2/VF2PostLayout metadata handling, and availability check that restricts VF2PostLayout to IBM devices.
TKET SDK action implementation
src/mqt/predictor/rl/actions/tket_actions.py
Implements optimization and routing actions; adds PreProcessTKETRoutingAfterQiskitLayout to reuse prior Qiskit layout. Includes run_tket_action dispatcher, final_layout_pytket_to_qiskit layout conversion, and availability check that gates optimization actions after layout.
BQSKit SDK action implementation
src/mqt/predictor/rl/actions/bqskit_actions.py
Implements optimization, synthesis, and mapping actions with BQSKit↔Qiskit circuit conversion via OpenQASM/U1q rewriting. Includes get_bqskit_native_gates caching, final_layout_bqskit_to_qiskit conversion with ancilla handling, run_bqskit_action dispatcher, and availability check blocking use with parameterized gates or existing layout.
PredictorEnv delegation to SDK helpers
src/mqt/predictor/rl/predictorenv.py
Refactors apply_action to dispatch by CompilationOrigin through run_qiskit_action, run_tket_action, run_bqskit_action; removes inline _apply_* implementations. Rebuilds action_masks using SDK availability functions. Wraps action application in exception handling for error truncation. Updates step to return truncated episodes on action failure.
Integration test refactor
tests/compilation/test_integration_further_SDKs.py
Replaces SDK-specific action tests with invariant-based tests using Qiskit circuit fixtures. Adds _setup_env and _is_available helpers, staged circuit fixtures (laid out, routed, synthesized), and four integration tests validating synthesis/layout/routing/optimization actions against preservation contracts.
Unit test updates
tests/compilation/test_helper_rl.py, tests/compilation/test_predictor_rl.py
Updates imports to reference new action module structure. Replaces direct _apply_qiskit_action calls with env.apply_action public interface.
Documentation and config
.gitignore, CHANGELOG.md
Adds RL training artifact pattern (events.out.*) to ignore list; documents action restructuring in Unreleased/Changed section.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

refactor

Suggested reviewers

  • burgholzer

Poem

🐰 Actions once scattered, now neatly arranged,
Qiskit, TKET, BQSKit—all exchanged!
SDK helpers handle dispatch with care,
While PredictorEnv breathes fresher air!
Modular passes, no more monolith's weight—
Quantum circuits route at quantum-gate rate!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change—refactoring RL actions to be organized by SDK—and is concise and specific.
Description check ✅ Passed The PR description explains the restructuring, objectives, and includes the required checklist; however, most checklist items remain unchecked, indicating incomplete follow-through on documentation and testing requirements.
Linked Issues check ✅ Passed The PR addresses issue #668 by restructuring RL actions into SDK-specific modules (Qiskit, TKET, BQSKit) with wrapper logic consolidated, achieving the stated objective of improving modularity and reducing scattered SDK patches.
Out of Scope Changes check ✅ Passed All changes are scoped to RL action restructuring and refactoring: new SDK-specific modules, consolidation of wrapper logic, updated imports, and refactored tests—no unrelated modifications detected.
Docstring Coverage ✅ Passed Docstring coverage is 98.08% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch 668-improve-action-pass-imports-and-wrappers

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 15

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CHANGELOG.md`:
- Line 14: Update the changelog entry that currently reads "♻️ Restructure
existing RL passes into SDK-level action modules ([`#644`])" to reference the
correct PR number ([`#680`]) and add a corresponding link definition for [`#680`] in
the PR links section so the markdown link resolves; modify the entry text where
the phrase "Restructure existing RL passes into SDK-level action modules"
appears and add the matching link definition in the PR links block (the section
that contains other bracketed PR references).

In `@src/mqt/predictor/rl/actions/__init__.py`:
- Around line 86-95: The docstring for remove_action promises ValueError on
missing names but the function currently raises KeyError; change the raised
exception to ValueError (e.g., replace raise KeyError(msg) with raise
ValueError(msg)) so the implementation matches the documented API, keeping the
existing message and using the same _ACTIONS lookup logic in remove_action to
locate the correct place to modify.
- Around line 44-68: The public dataclasses Action and DeviceDependentAction are
missing Google-style docstrings for their public fields; update the docstrings
on both classes (Action, DeviceIndependentAction, DeviceDependentAction) to use
Google-style format and add an "Attributes:" section that documents each exposed
field (for Action: name, origin, pass_type, transpile_pass, preserves_layout,
preserves_routing, preserves_synthesis; for DeviceDependentAction also document
transpile_pass override and do_while) with a brief one-line description and type
for each attribute, keeping descriptions concise and matching the existing field
names and types (e.g., Callable[[PropertySet], bool] | None for do_while).
- Around line 73-103: The three functions register_action, remove_action, and
get_actions_by_pass_type lack Google-style docstring sections; update each
function's docstring to include "Args:" documenting parameters (e.g., action:
Action for register_action, name: str for remove_action), "Returns:" documenting
the return types (Action for register_action, None for remove_action,
dict[PassType, list[Action]] for get_actions_by_pass_type), and keep the
existing "Raises:" entries (ValueError/KeyError) as appropriate; reference the
Action and PassType types and the global _ACTIONS registry in the descriptions
to make the purpose clear.

In `@src/mqt/predictor/rl/actions/bqskit_actions.py`:
- Around line 135-145: Update the docstrings to Google style for
get_bqskit_native_gates and final_layout_bqskit_to_qiskit: replace "Arguments:"
with "Args:", add structured "Returns:" and "Raises:" sections (and for
final_layout_bqskit_to_qiskit also add an "Args:" describing the input
types/semantics and a "Returns:" describing the returned layout mapping and its
format), and ensure types (e.g., device: Target, return: list[Gate] or mapping
types) and any error conditions are documented exactly as in the functions
get_bqskit_native_gates and final_layout_bqskit_to_qiskit so they conform to the
repository Google-style docstring guidelines.

In `@src/mqt/predictor/rl/actions/qiskit_actions.py`:
- Around line 326-350: The VF2PostLayout branch currently discards the updated
ApplyLayout property set returned by postprocess_vf2postlayout and rebuilds
TranspileLayout from stale values; instead, after calling
postprocess_vf2postlayout(altered_qc, post_layout, layout) and updating
altered_qc, use the updated entries in property_set (e.g.
property_set["layout"], property_set["original_qubit_indices"],
property_set["final_layout"]) when constructing and returning TranspileLayout so
the returned layout metadata matches the rewritten circuit; keep using the
altered_qc and _input_qubit_count as before.

In `@src/mqt/predictor/rl/actions/tket_actions.py`:
- Around line 150-162: The runtime contract mismatch: is_tket_action_available
currently allows PassType.ROUTING when has_layout is false but run_tket_action
asserts layout is not None before writing final_layout; fix by making
run_tket_action tolerant of a None layout for routing: when action.pass_type ==
PassType.ROUTING and layout is None, construct a new TranspileLayout (or
equivalent empty layout object used elsewhere, e.g. the same type expected by
final_layout_pytket_to_qiskit) before assigning layout.final_layout =
final_layout_pytket_to_qiskit(tket_qc, altered_qc); keep the existing behavior
if layout is provided. Ensure you touch run_tket_action (the layout handling
path) and do not change is_tket_action_available unless you prefer the
alternative of enforcing has_layout for ROUTING.

In `@src/mqt/predictor/rl/predictorenv.py`:
- Around line 172-183: The broad `except Exception:  # noqa: BLE001` around the
apply_action call masks unexpected bugs; either narrow it to the specific error
classes `apply_action` can raise (e.g., ValueError, RuntimeError, ActionError)
and handle those, or keep a single broad handler but add a short rationale
comment explaining why catching all exceptions is necessary (e.g., to truncate
the environment on any action failure) and re-raise or log truly unexpected
exceptions if needed; update the block around self.used_actions.append /
altered_qc = self.apply_action(action) and set self.error_occurred as before,
but prefer enumerating concrete exception types or justify the BLE001
suppression in a comment next to the except line referencing apply_action and
error handling policy.
- Around line 325-360: apply_action currently only validates that action_index
exists in self.action_set, allowing masked-out actions to be executed; call the
environment's action mask (e.g., self.action_masks()) at the start of
apply_action, verify that the bit/entry for action_index is True/allowed, and if
not raise an informative ValueError (include action_index and action.origin);
apply this check before handling PassType.TERMINATE and before calling
run_qiskit_action/run_tket_action/run_bqskit_action so masked TERM and SDK
actions are rejected rather than executed.
- Around line 313-324: Update the docstring for the method apply_action to use
Google-style section headers: replace "Arguments:" with "Args:", keep "Returns:"
and "Raises:" as-is but ensure formatting matches Google style (indent param
lines under Args:, describe return under Returns:, and exceptions under
Raises:). Ensure the parameter name action_index, its type and description, the
returned QuantumCircuit, and the ValueError explanation appear under the correct
Google-style headers in the apply_action docstring.
- Around line 140-143: reset() currently clears self.layout causing
determine_valid_actions_for_state() to miscompute laid_out and step() allows
actions without checking valid masks; update reset() to preserve an incoming
circuit layout when present (do not unconditionally set self.layout = None) and
ensure determine_valid_actions_for_state() uses self.layout correctly to compute
laid_out; in step(), validate the requested action against self.valid_actions or
action_masks() before calling apply_action() (disallow TERMINATE or SDK actions
when not in self.valid_actions); narrow the broad "except Exception" to a
specific exception class (or add a comment explaining why a catch-all is
required) and update apply_action() docstring heading from "Arguments:" to
Google-style "Args:" so docs are consistent.

In `@tests/compilation/test_integration_further_SDKs.py`:
- Line 107: Replace the private access layout_before._input_qubit_count by
deriving the count from public metadata (e.g. compute it as
len(layout_before.initial_layout) or another public property that lists the
input qubits) and remove the "# noqa: SLF001" suppression; update the call site
that used _input_qubit_count to use the computed integer instead (reference
symbols: layout_before and _input_qubit_count).
- Around line 234-243: The preserves_layout check can be bypassed when
env.layout is None; change the conditional so the test fails if layout was
dropped: assert env.layout is not None, (f"{action.name} on
{env.device.description} VIOLATED INVARIANT preserves_layout: layout metadata
was removed"), then compute post_v2p =
dict(env.layout.initial_layout.get_virtual_bits()) and assert post_v2p ==
pre_v2p with the existing descriptive message; keep the existing
is_circuit_laid_out(compiled, layout) assertion but replace the optional if
block around env.layout with an explicit assertion that env.layout remains
present before comparing mappings (references: env.layout,
initial_layout.get_virtual_bits(), pre_v2p, preserves_layout).
- Around line 36-45: Both helper functions use short one-line docstrings—replace
them with Google-style docstrings: for _setup_env(PredictorEnv env,
QuantumCircuit circuit, TranspileLayout|None layout, int n_qubits) include an
Args section describing each parameter and a Returns section noting None and the
side-effect of resetting the env; for _is_available(PredictorEnv env, int idx)
include an Args section for env and idx and a Returns section describing that it
returns bool indicating whether the action is structurally and SDK-valid and
that it updates env.valid_actions via determine_valid_actions_for_state. Ensure
parameter types match the signatures and keep descriptions concise.

In `@tests/compilation/test_predictor_rl.py`:
- Around line 182-183: The test currently mutates env.action_set[0] which is
safe because PredictorEnv builds self.action_set per-instance, but it’s brittle
to hardcode index 0; change the test to pick a valid routing index from the
environment (e.g., use an index from env.actions_routing_indices or another
existing key in env.action_set) and assign the action to that index instead of 0
so the test does not depend on internal action ordering in PredictorEnv.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a836606e-13dc-4e30-9f71-5b2118832d30

📥 Commits

Reviewing files that changed from the base of the PR and between 10b56d0 and 3ec22c7.

📒 Files selected for processing (12)
  • .gitignore
  • CHANGELOG.md
  • src/mqt/predictor/rl/actions.py
  • src/mqt/predictor/rl/actions/__init__.py
  • src/mqt/predictor/rl/actions/bqskit_actions.py
  • src/mqt/predictor/rl/actions/qiskit_actions.py
  • src/mqt/predictor/rl/actions/tket_actions.py
  • src/mqt/predictor/rl/parsing.py
  • src/mqt/predictor/rl/predictorenv.py
  • tests/compilation/test_helper_rl.py
  • tests/compilation/test_integration_further_SDKs.py
  • tests/compilation/test_predictor_rl.py
💤 Files with no reviewable changes (2)
  • src/mqt/predictor/rl/actions.py
  • src/mqt/predictor/rl/parsing.py

Comment thread CHANGELOG.md Outdated

### Changed

- ♻️ Restructure existing RL passes into SDK-level action modules ([#644]) ([**@flowerthrower**])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Correct the PR reference to match the current PR.

This changelog entry references [#644], but the current PR is #680. The entry should reference the PR that is actually merging these changes. Additionally, there is no link definition for [#644] in the PR links section (lines 50-64), which would result in a broken markdown link.

📝 Proposed fix

Update line 14 to reference the current PR and add the link definition:

-- ♻️ Restructure existing RL passes into SDK-level action modules ([`#644`]) ([**`@flowerthrower`**])
+- ♻️ Restructure existing RL passes into SDK-level action modules ([`#680`]) ([**`@flowerthrower`**])

Then add the link definition in the PR links section after line 51:

+[`#680`]: https://github.com/munich-quantum-toolkit/predictor/pull/680
 [`#677`]: https://github.com/munich-quantum-toolkit/predictor/pull/677
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CHANGELOG.md` at line 14, Update the changelog entry that currently reads "♻️
Restructure existing RL passes into SDK-level action modules ([`#644`])" to
reference the correct PR number ([`#680`]) and add a corresponding link definition
for [`#680`] in the PR links section so the markdown link resolves; modify the
entry text where the phrase "Restructure existing RL passes into SDK-level
action modules" appears and add the matching link definition in the PR links
block (the section that contains other bracketed PR references).

Comment thread src/mqt/predictor/rl/actions/__init__.py
Comment on lines +73 to +103
def register_action(action: Action) -> Action:
"""Registers a new action in the global actions registry.

Raises:
ValueError: If an action with the same name is already registered.
"""
if action.name in _ACTIONS:
msg = f"Action with name {action.name} already registered."
raise ValueError(msg)
_ACTIONS[action.name] = action
return action


def remove_action(name: str) -> None:
"""Removes an action from the global actions registry by name.

Raises:
ValueError: If no action with the given name is registered.
"""
if name not in _ACTIONS:
msg = f"No action with name {name} is registered."
raise KeyError(msg)
del _ACTIONS[name]


def get_actions_by_pass_type() -> dict[PassType, list[Action]]:
"""Returns a dictionary mapping each PassType to a list of Actions of that type."""
result: dict[PassType, list[Action]] = defaultdict(list)
for action in _ACTIONS.values():
result[action.pass_type].append(action)
return result
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Function docstrings should include Google-style Args/Returns sections.

register_action, remove_action, and get_actions_by_pass_type currently omit required Google-style parameter/return documentation.

Suggested fix
 def register_action(action: Action) -> Action:
     """Registers a new action in the global actions registry.
 
+    Args:
+        action: Action to register.
+
+    Returns:
+        The registered action.
+
     Raises:
         ValueError: If an action with the same name is already registered.
     """
@@
 def remove_action(name: str) -> None:
     """Removes an action from the global actions registry by name.
 
+    Args:
+        name: Name of the action to remove.
+
     Raises:
         ValueError: If no action with the given name is registered.
     """
@@
 def get_actions_by_pass_type() -> dict[PassType, list[Action]]:
-    """Returns a dictionary mapping each PassType to a list of Actions of that type."""
+    """Groups actions by pass type.
+
+    Returns:
+        Mapping from pass type to registered actions of that type.
+    """

As per coding guidelines: **/*.py: MUST use Google-style docstrings in Python code.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/mqt/predictor/rl/actions/__init__.py` around lines 73 - 103, The three
functions register_action, remove_action, and get_actions_by_pass_type lack
Google-style docstring sections; update each function's docstring to include
"Args:" documenting parameters (e.g., action: Action for register_action, name:
str for remove_action), "Returns:" documenting the return types (Action for
register_action, None for remove_action, dict[PassType, list[Action]] for
get_actions_by_pass_type), and keep the existing "Raises:" entries
(ValueError/KeyError) as appropriate; reference the Action and PassType types
and the global _ACTIONS registry in the descriptions to make the purpose clear.

Comment thread src/mqt/predictor/rl/actions/__init__.py
Comment on lines +135 to +145
def get_bqskit_native_gates(device: Target) -> list[Gate]:
"""Returns the native gates of the given device.

Arguments:
device: The device for which the native gates are returned.

Returns:
The native gates of the given device as BQSKit gates.

Raises:
ValueError: If a gate in the device is not supported in BQSKit.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Normalize these docstrings to the repo’s Google style.

get_bqskit_native_gates() uses Arguments: instead of Args:, and final_layout_bqskit_to_qiskit() is missing structured argument/return sections even though it carries a non-trivial layout-conversion contract.

As per coding guidelines, **/*.py: MUST use Google-style docstrings in Python code.

Also applies to: 220-231

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/mqt/predictor/rl/actions/bqskit_actions.py` around lines 135 - 145,
Update the docstrings to Google style for get_bqskit_native_gates and
final_layout_bqskit_to_qiskit: replace "Arguments:" with "Args:", add structured
"Returns:" and "Raises:" sections (and for final_layout_bqskit_to_qiskit also
add an "Args:" describing the input types/semantics and a "Returns:" describing
the returned layout mapping and its format), and ensure types (e.g., device:
Target, return: list[Gate] or mapping types) and any error conditions are
documented exactly as in the functions get_bqskit_native_gates and
final_layout_bqskit_to_qiskit so they conform to the repository Google-style
docstring guidelines.

Comment on lines 325 to +360
if action_index not in self.action_set:
msg = f"Action {action_index} not supported."
raise ValueError(msg)

action = self.action_set[action_index]

if action.name == "terminate":
if action.pass_type == PassType.TERMINATE:
return self.state

if action.origin == CompilationOrigin.QISKIT:
return self._apply_qiskit_action(action, action_index)
if action.origin == CompilationOrigin.TKET:
return self._apply_tket_action(action, action_index)
if action.origin == CompilationOrigin.BQSKIT:
return self._apply_bqskit_action(action, action_index)
msg = f"Origin {action.origin} not supported."
raise ValueError(msg)

def _apply_qiskit_action(self, action: Action, action_index: int) -> QuantumCircuit:
if action.name == "QiskitO3" and isinstance(action, DeviceDependentAction):
factory = cast("Callable[[list[str], CouplingMap | None], list[Task]]", action.transpile_pass)
passes = factory(
self.device.operation_names,
CouplingMap(self.device.build_coupling_map()) if self.layout else None,
altered_qc, self.layout = run_qiskit_action(
action=action,
circuit=self.state,
device=self.device,
layout=self.layout,
input_qubit_count=self.num_qubits_uncompiled_circuit,
)
assert action.do_while is not None
pm = PassManager([DoWhileController(passes, do_while=action.do_while)])
else:
if callable(action.transpile_pass):
factory = cast("Callable[[Target], list[Task]]", action.transpile_pass)
passes = factory(self.device)
else:
passes = cast("list[Task]", action.transpile_pass)
pm = PassManager(passes)

altered_qc = pm.run(self.state)

if action_index in (
self.actions_layout_indices + self.actions_mapping_indices + self.actions_final_optimization_indices
):
altered_qc = self._handle_qiskit_layout_postprocessing(action, pm, altered_qc)

elif (
action_index in self.actions_routing_indices and self.layout and pm.property_set["final_layout"] is not None
):
self.layout.final_layout = pm.property_set["final_layout"]

# BasisTranslator errors on unitary gates; decompose them immediately so
# the circuit is always in a consistent state after a Qiskit action.
if altered_qc.count_ops().get("unitary"):
altered_qc = altered_qc.decompose(gates_to_decompose="unitary")

return altered_qc

def _handle_qiskit_layout_postprocessing(
self, action: Action, pm: PassManager, altered_qc: QuantumCircuit
) -> QuantumCircuit:
if action.name == "VF2PostLayout":
assert pm.property_set["VF2PostLayout_stop_reason"] is not None
post_layout = pm.property_set["post_layout"]
if post_layout:
assert self.layout is not None
altered_qc, _ = postprocess_vf2postlayout(altered_qc, post_layout, self.layout)
elif action.name == "VF2Layout":
if pm.property_set["VF2Layout_stop_reason"] != VF2LayoutStopReason.SOLUTION_FOUND:
logger.warning(
"VF2Layout pass did not find a solution. Reason: %s",
pm.property_set["VF2Layout_stop_reason"],
)
else:
assert pm.property_set["layout"]
else:
assert pm.property_set["layout"]

if pm.property_set["layout"]:
# Layout/mapping passes create the base logical-to-physical mapping;
# later routing actions only update final_layout.
self.layout = TranspileLayout(
initial_layout=pm.property_set["layout"],
input_qubit_mapping=pm.property_set["original_qubit_indices"],
final_layout=pm.property_set["final_layout"],
_output_qubit_list=altered_qc.qubits,
_input_qubit_count=self.num_qubits_uncompiled_circuit,
elif action.origin == CompilationOrigin.TKET:
altered_qc, self.layout = run_tket_action(
action=action,
circuit=self.state,
device=self.device,
layout=self.layout,
)
return altered_qc

def _apply_tket_action(self, action: Action, action_index: int) -> QuantumCircuit:
tket_qc = qiskit_to_tk(self.state, preserve_param_uuid=True)
if callable(action.transpile_pass):
factory = cast("Callable[[Target], list[Task]]", action.transpile_pass)
passes = factory(self.device)
else:
passes = cast("list[Task]", action.transpile_pass)
for pass_ in passes:
assert isinstance(pass_, TketBasePass | PreProcessTKETRoutingAfterQiskitLayout)
pass_.apply(tket_qc)

qbs = tket_qc.qubits
tket_qc.rename_units({qbs[i]: Qubit("q", i) for i in range(len(qbs))})
altered_qc = tk_to_qiskit(tket_qc, replace_implicit_swaps=True)

if action_index in self.actions_routing_indices:
assert self.layout is not None
self.layout.final_layout = final_layout_pytket_to_qiskit(tket_qc, altered_qc)

return altered_qc

def _apply_bqskit_action(self, action: Action, action_index: int) -> QuantumCircuit:
"""Applies the given BQSKit action to the current state and returns the altered state.

Arguments:
action: The BQSKit action to be applied.
action_index: The index of the action in the action set.

Returns:
The altered quantum circuit after applying the action.

Raises:
ValueError: If the action index is not in the action set or if the action origin is not supported.
"""
bqskit_qc = qiskit_to_bqskit(self.state)
if action_index in self.actions_opt_indices:
transpile = cast("Callable[[Circuit], Circuit]", action.transpile_pass)
bqskit_compiled_qc = transpile(bqskit_qc)
elif action_index in self.actions_synthesis_indices:
factory = cast("Callable[[Target], Callable[[Circuit], Circuit]]", action.transpile_pass)
bqskit_compiled_qc = factory(self.device)(bqskit_qc)
elif action_index in self.actions_mapping_indices:
factory = cast(
"Callable[[Target], Callable[[Circuit], tuple[Circuit, tuple[int, ...], tuple[int, ...]]]]",
action.transpile_pass,
elif action.origin == CompilationOrigin.BQSKIT:
altered_qc, self.layout = run_bqskit_action(
action=action,
circuit=self.state,
device=self.device,
layout=self.layout,
)
bqskit_compiled_qc, initial, final = factory(self.device)(bqskit_qc)
compiled_qiskit_qc = bqskit_to_qiskit(bqskit_compiled_qc)
self.layout = final_layout_bqskit_to_qiskit(initial, final, compiled_qiskit_qc, self.state)
return compiled_qiskit_qc
else:
msg = f"Unhandled BQSKit action index: {action_index}"
msg = f"Origin {action.origin} not supported."
raise ValueError(msg)

return bqskit_to_qiskit(bqskit_compiled_qc)
return altered_qc
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot Jun 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject actions that are masked out for the current state.

apply_action() now only checks that the index exists. That lets callers bypass action_masks() entirely: TERMINATE succeeds from the initial state, and SDK actions filtered out by availability are still dispatched until a helper fails. Enforce the current mask here before returning self.state or calling any run_*_action helper.

Suggested fix
         if action_index not in self.action_set:
             msg = f"Action {action_index} not supported."
             raise ValueError(msg)
+        if not self.action_masks()[action_index]:
+            msg = f"Action {action_index} is not valid for the current circuit state."
+            raise ValueError(msg)
 
         action = self.action_set[action_index]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/mqt/predictor/rl/predictorenv.py` around lines 325 - 360, apply_action
currently only validates that action_index exists in self.action_set, allowing
masked-out actions to be executed; call the environment's action mask (e.g.,
self.action_masks()) at the start of apply_action, verify that the bit/entry for
action_index is True/allowed, and if not raise an informative ValueError
(include action_index and action.origin); apply this check before handling
PassType.TERMINATE and before calling
run_qiskit_action/run_tket_action/run_bqskit_action so masked TERM and SDK
actions are rejected rather than executed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above, this should not happen (or do you see a path?)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@flowerthrower, you're right — in the normal RL training loop the sequence is:

policy → action_masks()step()apply_action()

By the time apply_action() is reached, the policy has already used the mask to select only structurally and SDK-valid actions, so an additional guard inside apply_action() would be redundant, for exactly the same reason as the existing guard-free design of step(). I don't see a reachable path in the current codebase where a masked-out action could arrive there. Withdrawing the suggestion.


✏️ Learnings added
Learnt from: flowerthrower
Repo: munich-quantum-toolkit/predictor PR: 680
File: src/mqt/predictor/rl/predictorenv.py:325-360
Timestamp: 2026-06-08T13:25:02.753Z
Learning: In munich-quantum-toolkit/predictor, `PredictorEnv.apply_action()` in `src/mqt/predictor/rl/predictorenv.py` does not re-validate the requested action against `action_masks()` before dispatching to `run_*_action` helpers. This is intentional: `apply_action()` is only ever called from `step()`, which itself is only reached after the RL policy has already filtered actions through `action_masks()`, making an additional guard inside `apply_action()` redundant.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: flowerthrower
Repo: munich-quantum-toolkit/predictor PR: 680
File: src/mqt/predictor/rl/predictorenv.py:140-143
Timestamp: 2026-06-08T13:11:33.062Z
Learning: In munich-quantum-toolkit/predictor, `PredictorEnv.step()` in `src/mqt/predictor/rl/predictorenv.py` does not re-validate the requested action against `self.valid_actions` or `action_masks()` before calling `apply_action()`. This is intentional: step() is only called in a tightly controlled RL training loop after actions have already been filtered by masking, so an additional guard would be redundant.

Learnt from: flowerthrower
Repo: munich-quantum-toolkit/predictor PR: 680
File: src/mqt/predictor/rl/predictorenv.py:140-143
Timestamp: 2026-06-08T13:11:33.062Z
Learning: In munich-quantum-toolkit/predictor, the broad `except Exception` catch (with `# noqa: BLE001`) in `PredictorEnv.step()` in `src/mqt/predictor/rl/predictorenv.py` is intentional. Qiskit, TKET, and BQSKit compilation passes can raise a wide variety of unpredictable exceptions, and catching them all to truncate the episode gracefully is the desired design.

Learnt from: denialhaag
Repo: munich-quantum-toolkit/predictor PR: 572
File: tests/compilation/test_integration_further_SDKs.py:20-20
Timestamp: 2026-01-21T00:18:21.993Z
Learning: In the munich-quantum-toolkit/predictor repository, Ruff is configured with preview mode enabled (`preview = true`). The PLC2701 rule (import from private modules) is active, so `# noqa: PLC2701` directives are necessary when importing from private modules like `pytket._tket.passes`, even if Ruff reports RUF100 warnings suggesting the directive is unused.

Learnt from: flowerthrower
Repo: munich-quantum-toolkit/predictor PR: 680
File: src/mqt/predictor/rl/predictorenv.py:140-143
Timestamp: 2026-06-08T13:11:33.062Z
Learning: In munich-quantum-toolkit/predictor, `PredictorEnv.reset()` in `src/mqt/predictor/rl/predictorenv.py` unconditionally sets `self.layout = None`. This is intentional: reset() is only called at the start of a new rollout from the fully uncompiled training circuit set, so no pre-existing layout should ever be assumed or preserved.

Learnt from: denialhaag
Repo: munich-quantum-toolkit/predictor PR: 572
File: src/mqt/predictor/rl/predictorenv.py:0-0
Timestamp: 2026-01-21T00:17:19.184Z
Learning: In the munich-quantum-toolkit/predictor repository, Ruff is configured with preview = true and the "PL" (pylint) rule category in extend-select. PLC2701 (import-private-name) is a preview rule that flags imports from private modules (names starting with underscore). The `# noqa: PLC2701` directive on the import `from pytket._tket.passes import BasePass as TketBasePass` in src/mqt/predictor/rl/predictorenv.py is necessary and appropriate, even if Ruff reports RUF100 warnings suggesting the directive is unused.

Learnt from: flowerthrower
Repo: munich-quantum-toolkit/predictor PR: 526
File: src/mqt/predictor/rl/predictorenv.py:238-243
Timestamp: 2026-02-09T13:15:50.387Z
Learning: In src/mqt/predictor/rl/predictorenv.py, when reward_function is 'estimated_hellinger_distance', compute the Hellinger-distance reward at every step rather than only at episode termination. Currently, non-terminal steps receive no_effect_penalty, which prevents intermediate shaping signals. Implement delta-based reward shaping by producing or accumulating a per-step shaping signal (analogous to how it's done for expected_fidelity and estimated_success_probability) so that intermediate rewards reflect progress toward the goal. Maintain the existing terminal reward behavior on episode end. Add or adjust tests to verify step-wise rewards for this case and ensure the shaping signal is correctly integrated into the total return.

Learnt from: burgholzer
Repo: munich-quantum-toolkit/predictor PR: 526
File: src/mqt/predictor/rl/predictorenv.py:271-271
Timestamp: 2025-12-25T13:28:19.850Z
Learning: In the munich-quantum-toolkit/predictor repository, Ruff has a broad rule set including FLake8-SLF (slf) in extend-select. For private member access (e.g., self.state._layout = self.layout) in src/mqt/predictor/rl/predictorenv.py, include a per-file noqa directive: # noqa: SLF001. This is appropriate even if Ruff reports RUF100 as unused, to acknowledge intentional private attribute access and to avoid false positives in this specific code path. Apply this directive only to the files where private-member access is intentionally used and where SLF001 is the correct rationale.

Learnt from: linus-hologram
Repo: munich-quantum-toolkit/predictor PR: 641
File: src/mqt/predictor/rl/tracer.py:172-186
Timestamp: 2026-04-13T21:24:06.882Z
Learning: In the munich-quantum-toolkit/predictor repository, Ruff’s flake8-boolean-trap integration (FBT) is not enabled (FBT is not in Ruff’s `extend-select` in `pyproject.toml`), and the specific rule `FBT001` is not active. Therefore, during code reviews for this repo, do not flag boolean-typed positional arguments as `FBT001` Ruff violations unless `FBT001` is re-enabled in `pyproject.toml`.

Comment on lines +36 to +45
def _setup_env(env: PredictorEnv, circuit: QuantumCircuit, layout: TranspileLayout | None, n_qubits: int) -> None:
"""Reset env to the given circuit/layout state without starting a full RL episode."""
env.reset(qc=circuit.copy())
env.layout = layout
env.num_qubits_uncompiled_circuit = n_qubits

def test_bqskit_o2_action(available_actions_dict: dict[PassType, list[Action]]) -> None:
"""Test the BQSKitO2 action."""
action_bqskit_o2: Action | None = None
for action in available_actions_dict[PassType.OPT]:
if action.name == "BQSKitO2":
action_bqskit_o2 = action
assert action_bqskit_o2 is not None

qc = QuantumCircuit(2)
qc.h(0)
qc.cx(0, 1)

bqskit_qc = qiskit_to_bqskit(qc)
factory = cast("Callable[[Circuit], Circuit]", action_bqskit_o2.transpile_pass)
bqskit_qc_optimized = factory(bqskit_qc)
assert isinstance(bqskit_qc_optimized, Circuit)
optimized_qc = bqskit_to_qiskit(bqskit_qc_optimized)
def _is_available(env: PredictorEnv, idx: int) -> bool:
"""Return True if action idx is structurally and SDK-valid for the current env state."""
env.valid_actions = env.determine_valid_actions_for_state()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Use Google-style docstrings for helper functions with parameters and return values.

_setup_env and _is_available currently use short-form docstrings despite having non-trivial signatures.

Proposed docstring update
 def _setup_env(env: PredictorEnv, circuit: QuantumCircuit, layout: TranspileLayout | None, n_qubits: int) -> None:
-    """Reset env to the given circuit/layout state without starting a full RL episode."""
+    """Reset env to a given circuit/layout state without starting a full RL episode.
+
+    Args:
+        env: Predictor environment under test.
+        circuit: Circuit to load as the current state.
+        layout: Layout metadata to attach to the environment state.
+        n_qubits: Number of qubits in the original uncompiled circuit.
+    """
@@
 def _is_available(env: PredictorEnv, idx: int) -> bool:
-    """Return True if action idx is structurally and SDK-valid for the current env state."""
+    """Check whether an action index is currently available.
+
+    Args:
+        env: Predictor environment under test.
+        idx: Index in ``env.action_set``.
+
+    Returns:
+        True if the action is structurally valid and SDK-available.
+    """

As per coding guidelines, "**/*.py: MUST use Google-style docstrings in Python code".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/compilation/test_integration_further_SDKs.py` around lines 36 - 45,
Both helper functions use short one-line docstrings—replace them with
Google-style docstrings: for _setup_env(PredictorEnv env, QuantumCircuit
circuit, TranspileLayout|None layout, int n_qubits) include an Args section
describing each parameter and a Returns section noting None and the side-effect
of resetting the env; for _is_available(PredictorEnv env, int idx) include an
Args section for env and idx and a Returns section describing that it returns
bool indicating whether the action is structurally and SDK-valid and that it
updates env.valid_actions via determine_valid_actions_for_state. Ensure
parameter types match the signatures and keep descriptions concise.

input_qubit_mapping=dict(layout_before.input_qubit_mapping),
final_layout=pm.property_set.get("final_layout"),
_output_qubit_list=routed.qubits,
_input_qubit_count=layout_before._input_qubit_count, # noqa: SLF001
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Avoid private TranspileLayout field access and remove the SLF suppression.

Using _input_qubit_count couples tests to private internals; this can be derived from public metadata.

Proposed change
-        _input_qubit_count=layout_before._input_qubit_count,  # noqa: SLF001
+        _input_qubit_count=len(layout_before.input_qubit_mapping),
@@
-        _input_qubit_count=layout._input_qubit_count,  # noqa: SLF001
+        _input_qubit_count=len(layout.input_qubit_mapping),

As per coding guidelines, "PREFER fixing reported warnings over suppressing them ... only add ignore rules when necessary and document why".

Also applies to: 126-126

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/compilation/test_integration_further_SDKs.py` at line 107, Replace the
private access layout_before._input_qubit_count by deriving the count from
public metadata (e.g. compute it as len(layout_before.initial_layout) or another
public property that lists the input qubits) and remove the "# noqa: SLF001"
suppression; update the call site that used _input_qubit_count to use the
computed integer instead (reference symbols: layout_before and
_input_qubit_count).

Comment on lines +234 to +243
assert env.is_circuit_laid_out(compiled, layout), (
f"{action.name} on {env.device.description} VIOLATED INVARIANT preserves_layout: "
f"circuit no longer has a valid layout after action"
)
if env.layout is not None:
post_v2p = dict(env.layout.initial_layout.get_virtual_bits())
assert post_v2p == pre_v2p, (
f"{action.name} on {env.device.description} VIOLATED INVARIANT preserves_layout: "
f"initial qubit assignment changed. Before: {pre_v2p}, After: {post_v2p}"
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Strengthen preserves_layout assertions to validate env.layout directly.

The current check can pass even when an action drops layout metadata, because mapping comparison is skipped when env.layout is None.

Proposed tightening
-                assert env.is_circuit_laid_out(compiled, layout), (
+                assert env.layout is not None, (
+                    f"{action.name} on {env.device.description} VIOLATED INVARIANT preserves_layout: "
+                    f"layout metadata was dropped"
+                )
+                assert env.is_circuit_laid_out(compiled, env.layout), (
                     f"{action.name} on {env.device.description} VIOLATED INVARIANT preserves_layout: "
                     f"circuit no longer has a valid layout after action"
                 )
-                if env.layout is not None:
-                    post_v2p = dict(env.layout.initial_layout.get_virtual_bits())
-                    assert post_v2p == pre_v2p, (
-                        f"{action.name} on {env.device.description} VIOLATED INVARIANT preserves_layout: "
-                        f"initial qubit assignment changed. Before: {pre_v2p}, After: {post_v2p}"
-                    )
+                post_v2p = dict(env.layout.initial_layout.get_virtual_bits())
+                assert post_v2p == pre_v2p, (
+                    f"{action.name} on {env.device.description} VIOLATED INVARIANT preserves_layout: "
+                    f"initial qubit assignment changed. Before: {pre_v2p}, After: {post_v2p}"
+                )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/compilation/test_integration_further_SDKs.py` around lines 234 - 243,
The preserves_layout check can be bypassed when env.layout is None; change the
conditional so the test fails if layout was dropped: assert env.layout is not
None, (f"{action.name} on {env.device.description} VIOLATED INVARIANT
preserves_layout: layout metadata was removed"), then compute post_v2p =
dict(env.layout.initial_layout.get_virtual_bits()) and assert post_v2p ==
pre_v2p with the existing descriptive message; keep the existing
is_circuit_laid_out(compiled, layout) assertion but replace the optional if
block around env.layout with an explicit assertion that env.layout remains
present before comparing mappings (references: env.layout,
initial_layout.get_virtual_bits(), pre_v2p, preserves_layout).

Comment thread tests/compilation/test_predictor_rl.py Outdated
Comment on lines +182 to +183
env.action_set[0] = action
altered_qc = env.apply_action(action_index=0)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether PredictorEnv keeps an alias or a copy of the action registry.

set -euo pipefail

echo "=== PredictorEnv action_set assignment ==="
rg -n --type=py -C5 'self\.action_set\s*=' src/mqt/predictor/rl/predictorenv.py

echo
echo "=== Action registry declarations and accessors ==="
rg -n --type=py -C5 'ACTION_SET|action_set|def register_action|def remove_action|def get_actions_by_pass_type|def get_actions' src/mqt/predictor/rl/actions

echo
echo "=== Any in-test direct action_set mutations ==="
rg -n --type=py -C2 'action_set\[[^]]+\]\s*=' tests

Repository: munich-quantum-toolkit/predictor

Length of output: 3403


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== All occurrences of action_set in predictorenv.py ==="
rg -n --type=py 'action_set' src/mqt/predictor/rl/predictorenv.py

echo
echo "=== predictorenv.py: snippet around __init__ (action_set init) ==="
sed -n '50,110p' src/mqt/predictor/rl/predictorenv.py

echo
echo "=== predictorenv.py: snippet around apply_action ==="
rg -n --type=py 'def apply_action' -n src/mqt/predictor/rl/predictorenv.py | head
# print a larger window around the first match
python3 - <<'PY'
import re, pathlib
p = pathlib.Path("src/mqt/predictor/rl/predictorenv.py")
text = p.read_text()
m = re.search(r'def apply_action\s*\([^)]*\):', text)
if not m:
    m = re.search(r'def apply_action\b', text)
if not m:
    raise SystemExit("apply_action not found")
start = text.rfind('\n', 0, m.start())+1
# line numbers
lines = text.splitlines(True)
# compute line index
line_no = text[:m.start()].count('\n')+1
print(f"apply_action starts around line {line_no}")
lo = max(1, line_no-60)
hi = line_no+140
import subprocess, os, sys
print(subprocess.check_output(["bash","-lc",f"sed -n '{lo},{hi}p' src/mqt/predictor/rl/predictorenv.py"], text=True))
PY

echo
echo "=== tests/compilation/test_predictor_rl.py: snippet around the mutation ==="
sed -n '150,230p' tests/compilation/test_predictor_rl.py

Repository: munich-quantum-toolkit/predictor

Length of output: 14621


env.action_set is per-instance; mutating index 0 won’t leak across tests.

PredictorEnv constructs self.action_set as a new dict in __init__ and fills it from get_actions_by_pass_type(), so env.action_set[0] = action only changes that environment instance (it doesn’t alias the global registry).

Remaining brittleness: the test hardcodes 0; prefer assigning into an existing routing index (e.g., from env.actions_routing_indices) to avoid coupling to internal action ordering.

File: tests/compilation/test_predictor_rl.py (lines 182-183)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/compilation/test_predictor_rl.py` around lines 182 - 183, The test
currently mutates env.action_set[0] which is safe because PredictorEnv builds
self.action_set per-instance, but it’s brittle to hardcode index 0; change the
test to pick a valid routing index from the environment (e.g., use an index from
env.actions_routing_indices or another existing key in env.action_set) and
assign the action to that index instead of 0 so the test does not depend on
internal action ordering in PredictorEnv.

@flowerthrower flowerthrower marked this pull request as ready for review June 8, 2026 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

🎨 improve action/pass imports and wrappers

1 participant