rdchiral_plus

Wrapper for RDKit's RunReactants to improve stereochemistry handling

This repository is a fork of rdchiral, modified for improved performance while maintaining high consistency with the upstream library. It runs only marginally slower than the fast C++ version (rdchiral_cpp) while keeping the benefits of being written in Python, and it is pip-installable across platforms.

The interface (rdchiralRun, rdchiralRunText, rdchiralReaction, rdchiralReactants, rdchiralExtract, etc.) and returned data structures remain unchanged from the original library, so existing code should work with no modifications. While behavior is mostly consistent with the original library, this fork includes several important fixes and improvements.

Template application

  • Conjugated system bond direction correction: Corrects corrupted single-bond directions (ENDUPRIGHT, ENDDOWNRIGHT) in conjugated systems. Implemented from here
  • Broader stereochemistry handling: Stereochemistry for tetrahedral centers with lone pairs is accounted for
  • One-pot reactions: Templates are initialized with parentheses where needed so that templates defining multiple reactions on the same product are properly handled
  • Recursive template application: Templates can be recursively applied with a max_depth parameter, useful for symmetric reactions, or reactions that occur at multiple sites in a molecule
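The idea of depth-limited recursive application can be sketched independently of the library. The snippet below is a minimal illustration, not rdchiral_plus code: `apply_recursively`, `apply_once`, and `toy_template` are hypothetical names, and a toy string rewrite stands in for a real template. Only the `max_depth` name comes from the feature description above; how the library itself accepts it is documented in the full documentation.

```python
from typing import Callable, Iterable, List, Set


def apply_recursively(
    apply_once: Callable[[str], Iterable[str]],
    start: str,
    max_depth: int,
) -> Set[str]:
    """Collect every outcome reachable by applying a template up to max_depth times."""
    seen: Set[str] = set()
    frontier = {start}
    for _ in range(max_depth):
        next_frontier: Set[str] = set()
        for molecule in frontier:
            for outcome in apply_once(molecule):
                if outcome not in seen:
                    seen.add(outcome)
                    next_frontier.add(outcome)
        if not next_frontier:
            break  # nothing new was produced, so deeper levels cannot add outcomes
        frontier = next_frontier
    return seen


def toy_template(molecule: str) -> List[str]:
    """Hypothetical stand-in for a template: rewrite one 'X' site to 'Y' per outcome."""
    return [
        molecule[:i] + "Y" + molecule[i + 1:]
        for i, char in enumerate(molecule)
        if char == "X"
    ]


# A "molecule" with two equivalent reactive sites
one_step = apply_recursively(toy_template, "XAX", max_depth=1)
two_steps = apply_recursively(toy_template, "XAX", max_depth=2)
print(sorted(one_step))
print(sorted(two_steps))
```

With a depth of 1 only the singly transformed outcomes appear; a depth of 2 also reaches the outcome in which both sites have reacted, which is the situation described for symmetric or multi-site reactions.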

Template extraction

  • Configurable template extraction: Template extraction supports configurable radius and special group handling. Implemented from here
  • Deterministic template extraction: Replaced the random-shuffle loops used for tetrahedral center correction with a deterministic permutation-parity check. The old behavior could lead to inconsistent results or, in rare instances, hang.
  • Stereochemistry tracking: Inversions of tetrahedral centers are counted as changed atoms and included in the extracted template
  • Spectator tracking: Spectator molecules are included in extracted template dictionaries
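To illustrate why permutation parity makes the tetrahedral correction deterministic, here is a minimal sketch. It is not the fork's implementation; `permutation_parity` and `corrected_tag` are hypothetical helpers, and neighbors are represented by plain integers such as atom map numbers. It relies on the general fact that swapping two neighbors of a tetrahedral center in the written order inverts its `@`/`@@` tag, so an odd permutation flips the tag and an even one preserves it.

```python
from typing import List, Sequence


def permutation_parity(reference: Sequence[int], reordered: Sequence[int]) -> int:
    """Return 0 if `reordered` is an even permutation of `reference`, 1 if odd."""
    if (
        len(reference) != len(reordered)
        or len(set(reference)) != len(reference)
        or set(reference) != set(reordered)
    ):
        raise ValueError("inputs must be permutations of the same distinct items")

    position = {item: index for index, item in enumerate(reference)}
    perm: List[int] = [position[item] for item in reordered]

    # Count cycles; a permutation of n items with c cycles has parity (n - c) mod 2.
    visited = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if visited[start]:
            continue
        cycles += 1
        current = start
        while not visited[current]:
            visited[current] = True
            current = perm[current]
    return (len(perm) - cycles) % 2


def corrected_tag(tag: str, reference: Sequence[int], reordered: Sequence[int]) -> str:
    """Flip '@' <-> '@@' when the neighbor reordering is an odd permutation."""
    flipped = {"@": "@@", "@@": "@"}
    return flipped[tag] if permutation_parity(reference, reordered) else tag


neighbors = [1, 2, 3, 4]
print(corrected_tag("@", neighbors, [2, 1, 3, 4]))  # one swap: odd permutation
print(corrected_tag("@", neighbors, [2, 3, 1, 4]))  # three-cycle: even permutation
```

Because the answer is computed in a single pass over the neighbor ordering, there is no retry loop that depends on a random shuffle, which is what removes both the run-to-run variation and the possibility of hanging.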

General

  • Automatic dependency installation: RDKit is automatically installed as a dependency

Consistency with the original library

The changes above result in minor differences in behavior compared to the original library. In most cases where behavior differs, rdchiral_plus produces the more accurate result. The table below reports round-trip accuracy: a template is extracted from an atom-mapped reaction SMILES and then applied to the product SMILES, and the round trip counts as successful if it recovers the expected reactant SMILES. rdchiral_plus reduces the number of failed round trips by about 90% compared to rdchiral and about 94% compared to rdchiral_cpp.

| library       | successful round trips | success rate |
| rdchiral      | 49223 / 50016          | 98.41%       |
| rdchiral_cpp  | 48694 / 50016          | 97.36%       |
| rdchiral_plus | 49935 / 50016          | 99.84%       |
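The quoted reductions in failed round trips follow directly from the counts in the table. The short calculation below reproduces them; the variable and function names are only illustrative.

```python
TOTAL = 50016
successes = {
    "rdchiral": 49223,
    "rdchiral_cpp": 48694,
    "rdchiral_plus": 49935,
}

# Failed round trips are the complement of the successful ones.
failures = {name: TOTAL - ok for name, ok in successes.items()}


def reduction_vs(baseline: str) -> float:
    """Fractional reduction in failed round trips of rdchiral_plus relative to a baseline."""
    return 1 - failures["rdchiral_plus"] / failures[baseline]


for name, ok in successes.items():
    print(f"{name}: {ok / TOTAL:.2%} success, {failures[name]} failures")

print(f"reduction vs rdchiral:     {reduction_vs('rdchiral'):.1%}")
print(f"reduction vs rdchiral_cpp: {reduction_vs('rdchiral_cpp'):.1%}")
```

The failure counts come out to 793, 1322, and 81, giving reductions of roughly 89.8% and 93.9%, which round to the 90% and 94% stated above.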

See here for details on how consistency is measured against the original library and full details of what changes you can expect compared to the original rdchiral library.

Requirements

  • RDKit (version >= 2019)
  • Python (version >= 3.10)

Installation

Install rdchiral_plus from PyPI:

pip install rdchiral-plus

Or install rdchiral_plus with pip directly from this repo:

pip install git+https://github.com/denovochem/rdchiral_plus.git

This fork can optionally be compiled with mypyc. In our testing, performance is not noticeably improved, because most of the computationally expensive work in this library is done by RDKit, which is already primarily written in C++.

For mypyc compilation:

RDCHIRAL_USE_MYPYC=1 pip install "git+https://github.com/denovochem/rdchiral_plus.git"

Basic usage

# rdchiralRun is used below, so it is imported alongside the other entry points
from rdchiral import rdchiralRunText, rdchiralRun, rdchiralReaction, rdchiralReactants

# Run directly from SMARTS and SMILES strings
# This is slower than pre-initializing rdchiralReaction and rdchiralReactants when
# processing a large number of reactions
reaction_smarts = '[C:1][OH:2]>>[C:1][O:2][C]'
reactant_smiles = 'OCC(=O)OCCCO'
outcomes = rdchiralRunText(reaction_smarts, reactant_smiles)
print(outcomes)

# Pre-initialize then run
rxn = rdchiralReaction(reaction_smarts)
reactants = rdchiralReactants(reactant_smiles)
outcomes = rdchiralRun(rxn, reactants)
print(outcomes)

# Get list of atoms that changed
outcomes, mapped_outcomes = rdchiralRun(rxn, reactants, return_mapped=True)
print(outcomes, mapped_outcomes)

Documentation

Full documentation is available here

Contributing

  • Feature ideas and bug reports are welcome on the Issue Tracker.
  • Fork the source code on GitHub, make changes and file a pull request.

License

rdchiral_plus is licensed under the MIT license.
