Conversation
There was a problem hiding this comment.
Pull request overview
This PR strengthens ARC’s Transition State (TS) validation logic by improving the statistical robustness of the NMD (Normal Mode Displacement) check and making the IRC (Intrinsic Reaction Coordinate) endpoint validation chemically aware via molecular graph isomorphism, with a fallback to the previous bond-list approach.
Changes:
- Update NMD TS validation to use robust statistics (median + MAD) and add a formed/broken bond directionality gate.
- Update IRC endpoint validation to perceive fragments and validate reactant/product identity via graph isomorphism (with a connectivity fallback).
- Add/adjust unit tests for the new NMD and IRC behaviors.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
arc/checks/ts.py |
Adds fragment perception + isomorphism-based IRC validation with fallback. |
arc/checks/ts_test.py |
Adds tests for fragment perception, fragment/species matching, and IRC endpoint swapping. |
arc/checks/nmd.py |
Adds directionality gate, module constants, and switches baseline/spread to median+MAD. |
arc/checks/nmd_test.py |
Updates expectations for new robust stats and adds tests for directionality + constants. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #838 +/- ##
========================================
Coverage 58.83% 58.83%
========================================
Files 97 97
Lines 29355 29502 +147
Branches 7791 7831 +40
========================================
+ Hits 17271 17358 +87
- Misses 9877 9916 +39
- Partials 2207 2228 +21
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| """ | ||
| diffs = list() | ||
| report = None | ||
| amplitude = amplitude or 1.0 |
There was a problem hiding this comment.
I know that this code isn't new or being updated but I do think it's worth pointing out that since amplitude = amplitude or 1.0 then amplitude can never be None so the first half check we have at line 445 is redundant
arc/checks/nmd.py
Outdated
There was a problem hiding this comment.
Correct me if I'm but since diff is the change between the two xyzs and those geometries were already generated with amplitude applied in get_displace_xyzs(), hasn't the amplitude already influenced the result. So we are scaling it a second time. Is that intended?
|
Thanks for the comments, @calvinp0! |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
arc/checks/nmd.py:456
- After the change to the “insignificant motion” gate, the
amplitudeparameter inget_bond_length_changes()no longer affects the calculation (it’s only used for the debugreportstring and is defaulted to 1.0). This is a bit misleading given the docstring/argument list; consider either removingamplitudefrom the signature (and callers) or updating the docstring to clarify it’s only used for reporting/logging now.
amplitude = amplitude or 1.0
for bond in bonds:
r_bond_length = get_bond_length_in_reaction(bond=bond, xyz=xyzs[0], weights=weights)
p_bond_length = get_bond_length_in_reaction(bond=bond, xyz=xyzs[1], weights=weights)
if r_bond_length is None or p_bond_length is None:
continue
diff = abs(r_bond_length - p_bond_length)
if return_none_if_change_is_insignificant \
and abs(diff / r_bond_length) < 0.05 \
and abs(diff / p_bond_length) < 0.05:
return None, None
diffs.append(diff)
if considered_reactive:
report = f'{float(r_bond_length):.2f} {float(p_bond_length):.2f} {float(diff):.2f} {amplitude}'
diffs = np.array(diffs)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
be65b3e to
a4e8b48
Compare
calvinp0
left a comment
There was a problem hiding this comment.
Looking good, just a few comments first
|
Thanks @calvinp0, excellent comments. I addressed them in two fixup commits. let me know if it looks good and I'll squash |
NMD Improvements (arc/checks/ts.py): - Replaced mean-based displacement baseline with Median + MAD (Median Absolute Deviation) to insulate the check from floppy rotors. - Implemented a mandatory Directionality Check: ensures formed and broken bonds move in anti-correlated directions along the imaginary mode. - Separated primary (formed/broken) from secondary (changed-order) bonds in the sigma test to reflect physical displacement scales. - Set a global numerical noise floor (1e-4 A) for the Hessian and raised the default validation threshold to 3.0 sigma.
This reaction was reported to be problematic in the past in the context of NMD checks
IRC Improvements (arc/checks/ts.py): - Upgraded 'check_irc_species_and_rxn' to use molecular graph isomorphism as the primary validation method. - Added 'perceive_irc_fragments' using DFS-based connected component detection to handle multi-species reactions (e.g., A + B) reliably. - Implemented permutation-based matching to verify that the set of perceived IRC fragments matches the expected reaction species. - Retained distance-matrix bond-list comparison as a robust fallback.
This PR significantly improves how ARC validates Transition States by making the NMD (Normal Mode Displacement) check more statistically sound and the IRC (Intrinsic Reaction Coordinate) check chemically aware.
1. NMD: Smarter "Sanity Checks"
The NMD check is our first line of defense. It now uses Median + MAD (Median Absolute Deviation) statistics instead of simple averages.
2. IRC: From Geometry to Chemistry
Previously, the IRC check only looked at a "list of distances." It now performs Full Molecular Graph Isomorphism.
3. Verification
arc/checks/ts_test.pypass.