Wrapper for RDKit's RunReactants to improve stereochemistry handling
This repository is a fork of rdchiral. It has been modified for improved performance while maintaining high consistency with the upstream library. These modifications provide speed that is marginally slower than the fast C++ version (rdchiral_cpp), but has the benefits of being written in Python. This library is pip installable cross platform.
The interface (rdchiralRun, rdchiralRunText, rdchiralReaction, rdchiralReactants, rdchiralExtract, etc.) and returned data structures remain unchanged from the original library, so existing code should work with no modifications. While behavior is mostly consistent with the original library, this fork includes several important fixes and improvements.
- Conjugated system bond direction correction: Corrects corrupted single-bond directions (ENDUPRIGHT, ENDDOWNRIGHT) in conjugated systems. Implemented from here
- Broader stereochemistry handling: Stereochemistry for tetrahedral centers with lone pairs is accounted for
- One-pot reactions: Templates are initialized with parentheses where needed so that templates defining multiple reactions on the same product are properly handled
- Recursive template application: Templates can be recursively applied with a max_depth parameter, useful for symmetric reactions, or reactions that occur at multiple sites in a molecule
- Configurable template extraction: Template extraction supports configurable radius and special group handling. Implemented from here
- Deterministic template extraction: Replaced random shuffle-based tetrahedral center correction loops with deterministic permutation parity - the old behavior could lead to inconsistent results or hang in rare instances.
- Stereochemistry tracking: Inversions of tetrahedral centers are counted as a changed atom, and included in the extracted template
- Spectator tracking: Spectator molecules are included in extracted template dictionaries
- Automatic dependency installation: RDKit is automatically installed as a dependency
The changes above result in minor differences in behavior compared to the original library. In most cases where behavior is different, rdchiral_plus produces the more accurate result. The table below shows the roundtripability of extracting a template from atom mapped reaction SMILES, and then applying that template to the product SMILES to recover the expected reactant SMILES. rdchiral_plus reduces the number of incorrect roundtrips by 90% compared to rdchiral, and 94% compared to rdchiral_cpp.
| library | successful roundtrips | success rate |
|---|---|---|
| rdchiral | 49223 / 50016 | 98.41% |
| rdchiral_cpp | 48694 / 50016 | 97.36% |
| rdchiral_plus | 49935 / 50016 | 99.84% |
See here for details on how consistency is measured against the original library and full details of what changes you can expect compared to the original rdchiral library.
- RDKit (version >= 2019)
- Python (version >= 3.10)
Install rdchiral_plus from PyPI:
pip install rdchiral-plusOr install rdchiral_plus with pip directly from this repo:
pip install git+https://github.com/denovochem/rdchiral_plus.gitThis fork can be optionally compiled with mypyc. In our testing performance is not noticeably improved, as most of the computationally expensive work in this library is done with rdkit, which is already primarily written in C++.
For mypyc compilation:
RDCHIRAL_USE_MYPYC=1 pip install "git+https://github.com/denovochem/rdchiral_plus.git"from rdchiral import rdchiralRunText, rdchiralReaction, rdchiralReactants
# Run directly from SMARTS and SMILES strings
# This is slower than pre-initializing rdchiralReaction and rdchiralReactants when
# processing a large number of reactions
reaction_smarts = '[C:1][OH:2]>>[C:1][O:2][C]'
reactant_smiles = 'OCC(=O)OCCCO'
outcomes = rdchiralRunText(reaction_smarts, reactant_smiles)
print(outcomes)
# Pre-initialize then run
rxn = rdchiralReaction(reaction_smarts)
reactants = rdchiralReactants(reactant_smiles)
outcomes = rdchiralRun(rxn, reactants)
print(outcomes)
# Get list of atoms that changed
outcomes, mapped_outcomes = rdchiralRun(rxn, reactants, return_mapped=True)
print(outcomes, mapped_outcomes)Full documentation is available here
- Feature ideas and bug reports are welcome on the Issue Tracker.
- Fork the source code on GitHub, make changes and file a pull request.
rdchiral_plus is licensed under the MIT license.