Skip to content

Improved the TS NMD & IRC checks#838

Merged
alongd merged 6 commits intomainfrom
nmd_irc
Apr 3, 2026
Merged

Improved the TS NMD & IRC checks#838
alongd merged 6 commits intomainfrom
nmd_irc

Conversation

@alongd
Copy link
Copy Markdown
Member

@alongd alongd commented Mar 24, 2026

This PR significantly improves how ARC validates Transition States by making the NMD (Normal Mode Displacement) check more statistically sound and the IRC (Intrinsic Reaction Coordinate) check chemically aware.


1. NMD: Smarter "Sanity Checks"

The NMD check is our first line of defense. It now uses Median + MAD (Median Absolute Deviation) statistics instead of simple averages.

  • Why? This prevents "floppy" parts of a molecule (like a rotating methyl group) from ruining the vibration check for the actual reactive center.
  • Directionality Gate: Added a physical check to ensure that as a bond "breaks," the forming bond "shortens" (and vice versa). If they move in the same direction, the TS is rejected as non-reactive.

2. IRC: From Geometry to Chemistry

Previously, the IRC check only looked at a "list of distances." It now performs Full Molecular Graph Isomorphism.

  • Fragment Awareness: It now uses a DFS-based search to identify individual molecules (e.g., $A + B$) at the IRC endpoints.
  • Chemical Identity: It verifies bond orders, formal charges, and radical states. It won't just ask "Are these atoms near each other?" but "Is this actually the Molecule I expected?"
  • Fallback: If the complex graph perception fails, it safely falls back to the original bond-list comparison to avoid "False Negatives."

3. Verification

  • Added 9 new tests covering multi-fragment perception, charge routing, and swapped IRC endpoints.
  • All 25 tests in arc/checks/ts_test.py pass.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR strengthens ARC’s Transition State (TS) validation logic by improving the statistical robustness of the NMD (Normal Mode Displacement) check and making the IRC (Intrinsic Reaction Coordinate) endpoint validation chemically aware via molecular graph isomorphism, with a fallback to the previous bond-list approach.

Changes:

  • Update NMD TS validation to use robust statistics (median + MAD) and add a formed/broken bond directionality gate.
  • Update IRC endpoint validation to perceive fragments and validate reactant/product identity via graph isomorphism (with a connectivity fallback).
  • Add/adjust unit tests for the new NMD and IRC behaviors.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
arc/checks/ts.py Adds fragment perception + isomorphism-based IRC validation with fallback.
arc/checks/ts_test.py Adds tests for fragment perception, fragment/species matching, and IRC endpoint swapping.
arc/checks/nmd.py Adds directionality gate, module constants, and switches baseline/spread to median+MAD.
arc/checks/nmd_test.py Updates expectations for new robust stats and adds tests for directionality + constants.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.83%. Comparing base (69df219) to head (7d96eef).
⚠️ Report is 8 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff            @@
##             main     #838    +/-   ##
========================================
  Coverage   58.83%   58.83%            
========================================
  Files          97       97            
  Lines       29355    29502   +147     
  Branches     7791     7831    +40     
========================================
+ Hits        17271    17358    +87     
- Misses       9877     9916    +39     
- Partials     2207     2228    +21     
Flag Coverage Δ
functionaltests 58.83% <ø> (+<0.01%) ⬆️
unittests 58.83% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

"""
diffs = list()
report = None
amplitude = amplitude or 1.0
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that this code isn't new or being updated but I do think it's worth pointing out that since amplitude = amplitude or 1.0 then amplitude can never be None so the first half check we have at line 445 is redundant

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm but since diff is the change between the two xyzs and those geometries were already generated with amplitude applied in get_displace_xyzs(), hasn't the amplitude already influenced the result. So we are scaling it a second time. Is that intended?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I agree

@alongd
Copy link
Copy Markdown
Member Author

alongd commented Mar 31, 2026

Thanks for the comments, @calvinp0!
All were addressed and squashed. One test fails due to env installation, strange. Other than that, we're ready for another review round

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

arc/checks/nmd.py:456

  • After the change to the “insignificant motion” gate, the amplitude parameter in get_bond_length_changes() no longer affects the calculation (it’s only used for the debug report string and is defaulted to 1.0). This is a bit misleading given the docstring/argument list; consider either removing amplitude from the signature (and callers) or updating the docstring to clarify it’s only used for reporting/logging now.
    amplitude = amplitude or 1.0

    for bond in bonds:
        r_bond_length = get_bond_length_in_reaction(bond=bond, xyz=xyzs[0], weights=weights)
        p_bond_length = get_bond_length_in_reaction(bond=bond, xyz=xyzs[1], weights=weights)
        if r_bond_length is None or p_bond_length is None:
            continue
        diff = abs(r_bond_length - p_bond_length)
        if return_none_if_change_is_insignificant \
                and abs(diff / r_bond_length) < 0.05 \
                and abs(diff / p_bond_length) < 0.05:
            return None, None
        diffs.append(diff)
        if considered_reactive:
            report = f'{float(r_bond_length):.2f} {float(p_bond_length):.2f} {float(diff):.2f} {amplitude}'
    diffs = np.array(diffs)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@alongd alongd force-pushed the nmd_irc branch 2 times, most recently from be65b3e to a4e8b48 Compare April 3, 2026 11:02
Copy link
Copy Markdown
Member

@calvinp0 calvinp0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, just a few comments first

@alongd
Copy link
Copy Markdown
Member Author

alongd commented Apr 3, 2026

Thanks @calvinp0, excellent comments. I addressed them in two fixup commits. let me know if it looks good and I'll squash

Copy link
Copy Markdown
Member

@calvinp0 calvinp0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

alongd added 6 commits April 3, 2026 19:16
NMD Improvements (arc/checks/ts.py):
- Replaced mean-based displacement baseline with Median + MAD (Median 
  Absolute Deviation) to insulate the check from floppy rotors.
- Implemented a mandatory Directionality Check: ensures formed and broken 
  bonds move in anti-correlated directions along the imaginary mode.
- Separated primary (formed/broken) from secondary (changed-order) bonds 
  in the sigma test to reflect physical displacement scales.
- Set a global numerical noise floor (1e-4 A) for the Hessian and 
  raised the default validation threshold to 3.0 sigma.
This reaction was reported to be problematic in the past in the context of NMD checks
IRC Improvements (arc/checks/ts.py):
- Upgraded 'check_irc_species_and_rxn' to use molecular graph 
  isomorphism as the primary validation method.
- Added 'perceive_irc_fragments' using DFS-based connected component 
  detection to handle multi-species reactions (e.g., A + B) reliably.
- Implemented permutation-based matching to verify that the set of 
  perceived IRC fragments matches the expected reaction species.
- Retained distance-matrix bond-list comparison as a robust fallback.
@alongd alongd merged commit 881a614 into main Apr 3, 2026
8 checks passed
@alongd alongd deleted the nmd_irc branch April 3, 2026 20:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants