Commit ccd459a

Experimental ligand support
1 parent 9e4dcc1 commit ccd459a

17 files changed

Lines changed: 5306 additions & 102 deletions

README.md

Lines changed: 6 additions & 1 deletion
@@ -12,6 +12,8 @@ A utility to automatically prepare structures from the PDB for molecular dynamic
 * [X] Automatically trim together structures to be the same length
 * [X] Run simple MD simulations for testing, validation and minimisation
 * [X] Create 'morph' trajectories with metadynamics
+* [X] Automatically extract and fix hetatms\ligands
+* [X] Output PQR files
 * [ ] Automatically propagate metadata through to finalised structure files
 * [ ] AIIDA integration

@@ -36,6 +38,7 @@ A utility to automatically prepare structures from the PDB for molecular dynamic
 By default, `prepmd` will read missing residues from the pdb/mmcif metadata, attempt to align the missing residues with the currently present residues, and then build missing loops. You can manually provide a FASTA file containing the alignment data with `--fasta`. You can also ask prepmd to get the sequence data from UNIPROT instead, with `--download`, though this is not recommended, as the raw sequence data can be different from the PDB and cause the alignment to fail.
 ### Other usage notes
 * `prepmd` will attempt to guess the correct file format from the filenames it's given. It won't perform implicit conversions, so make sure to start and end with the same file type.
+* By default, `prepmd` removes ligands and other molecules from the input and saves each residue to a separate SDF file. You can disable this behaviour with the `--ignore_hettams` flag.
 * By default, `prepmd` will leave intermediate files in a randomly-named temporary directory. You can set the name of this directory: `prepmd --wdir 6xov_temp 6xov 6xov.cif`.
 * While both pdb and mmCif are supported, using the mmCif format is strongly recommended, as the pdb format has been deprecated since 2024.
 * Use `prepmd --help` for a full list of parameters.
@@ -51,6 +54,8 @@ By default, `prepmd` will read missing residues from the pdb/mmcif metadata, att
 `runmd structure.cif -o structure_minimised.cif --traj_out traj.xtc --md_steps 500 --step 50 -ff amber14` runs with amber14. charmm36, amoeba, amber14 and amber19 are available, with charmm36 being the default.
 ### Equilibrate side chains:
 `runmd structure.cif -o structure_minimised.cif --fix_backbone -solv tip4pew --notest` will fix the backbone in place and only equilibrate side chains.
+### Add ligands:
+`runmd structure.cif -l LIG.sdf -ff amber14` runs a simulation with a ligand. You can add multiple ligands by using the `-l` argument multiple times. Ligands are simulated using OpenFF. OpenFF has limited compatibility with force fields and solvent models, so ligand simulations only run with the amber14 force field and explicit solvent. By default, ligand simulations also run with a smaller timestep.
 ### Create a morph trajectory:
 `runmd pre.cif -m post.cif -o minimised_out.pdb` will create a trajectory that smoothly transitions between pre.cif and post.cif. This trajectory is created using OpenMM's metadynamics features. Note: this should only be used for visualisation/illustration as trajectories created this way are arbitrary representations of structural transitions that aren't guaranteed to represent the underlying physics and biology.
 If you have two files for the same structure which aren't aligned (e.g. they have slightly different starting/ending residues), you can trim the ends to align them: `aligntogether pre.cif post.cif pre_cropped.cif post_cropped.cif`
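
Reviewer note on the "Add ligands" README text in this hunk: the repeated `-l` flag can be awkward to assemble by hand when scripting runs over several ligands. The following is a minimal sketch, not part of the commit, of building such a command line from Python. The helper name `build_runmd_command` and the file names `LIG1.sdf` and `LIG2.sdf` are hypothetical; only the `runmd`, `-l` and `-ff` spellings are taken from the README text above.

```python
import shlex


def build_runmd_command(structure, ligands, forcefield="amber14"):
    """Assemble a runmd argument list with one -l flag per ligand file.

    Hypothetical helper for illustration; it only builds the argument
    list and does not run anything.
    """
    cmd = ["runmd", structure]
    for ligand in ligands:
        # The README says -l may be given multiple times, once per ligand.
        cmd += ["-l", ligand]
    # The README states ligand simulations only run with amber14.
    cmd += ["-ff", forcefield]
    return cmd


cmd = build_runmd_command("structure.cif", ["LIG1.sdf", "LIG2.sdf"])
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where `runmd` is installed.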
@@ -69,7 +74,7 @@ If you have two files for the same structure which aren't aligned (e.g. they hav
 AGPLv3

 ## Contributors
-prepmd is developed by Rob Welch. Thanks to Harry Swift for helping set up the CI. This project is funded by [DRIIMB](https://driimb.org/). prepmd makes use of
+prepmd is developed by Rob Welch. Thanks to Harry Swift for helping set up the CI. This project is funded by [DRIIMB](https://driimb.org/).

 ## Dependencies
 * OpenMM

environment.yaml

Lines changed: 2 additions & 0 deletions
@@ -13,3 +13,5 @@ dependencies:
 - biopython
 - pytest
 - mdanalysis
+- openmmtools
+- rdkit

prepmd/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -10,4 +10,5 @@
 from . import add_modeller_license
 from . import point_cloud
 from . import lib
+from . import ligand
 __version__ = "1.0"

prepmd/get_residues.py

Lines changed: 14 additions & 4 deletions
@@ -11,7 +11,7 @@
 from prepmd import util


-def get_residues_pdb(pdb, code):
+def get_residues_pdb(pdb, code, get_hetatms=False):
     """
     Get the fasta sequence of residues in the ATOM entries of a PDB or mmCif
     file.
@@ -25,6 +25,8 @@ def get_residues_pdb(pdb, code):
         raise ImportError("Can't run without MODELLER and a valid license key")
     log.none()
     e = Environ()
+    if get_hetatms:
+        e.io.hetatm = True
     m = Model(e, file=pdb)
     aln = Alignment(e)
     aln.append_model(m, align_codes=code)
@@ -34,7 +36,7 @@ def get_residues_pdb(pdb, code):
     return original_fasta


-def get_fullseq_pdb(pdb, code):
+def get_fullseq_pdb(pdb, code, get_hetatms=False):
     """
     Get the fasta sequence of residues in the SEQRES records of a PDB/mmCif
     file.
@@ -45,7 +47,7 @@ def get_fullseq_pdb(pdb, code):
     the fasta sequence as a string
     """
     seqres = {}
-
+    hetatms_found = False
     # pdb
     with open(pdb) as file:
         for line in file:
@@ -56,6 +58,8 @@ def get_fullseq_pdb(pdb, code):
                     seqres[chain] = []
                 sequence = split[4:]
                 seqres[chain] += (sequence)
+            if line.startswith("HET") and not hetatms_found and get_hetatms:
+                hetatms_found = True

     # mmcif
     if seqres == {}:
@@ -72,7 +76,13 @@ def get_fullseq_pdb(pdb, code):
                     sequence = line.split()[2]
                     seqres[chain] .append(sequence)
                 if line.startswith("#"):
-                    reading_seq = False
+                    reading_seq = False  # TODO: option to add marker for hetatms
+                if line.startswith("HET") and not hetatms_found and get_hetatms:
+                    hetatms_found = True
+
+    if hetatms_found:
+        last_key = sorted(seqres.keys())[-1]
+        seqres[last_key] += ["..."]  # not '.h.'?

     # convert to fasta
     fastas = []
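
Reviewer note on the `get_fullseq_pdb` change above: the new logic scans for any line starting with `HET` and, if one is found, appends a `"..."` marker to the chain whose ID sorts last. The following is a minimal standalone sketch of that behaviour for PDB-format input only, not the commit's actual function. The name `seqres_with_hetatm_marker`, the sample lines, and the assumption that the chain ID is the third whitespace-separated field of a `SEQRES` record are all illustrative. The `and seqres` guard is an addition not present in the diff; it avoids an `IndexError` when no `SEQRES` records exist.

```python
def seqres_with_hetatm_marker(pdb_lines, get_hetatms=False):
    """Collect SEQRES residues per chain, optionally flagging hetatms.

    Illustrative sketch only: mirrors the PDB branch of the diff above.
    """
    seqres = {}
    hetatms_found = False
    for line in pdb_lines:
        if line.startswith("SEQRES"):
            split = line.split()
            chain = split[2]  # assumed position of the chain ID
            seqres.setdefault(chain, [])
            seqres[chain] += split[4:]  # residue names follow the count
        # "HET" also matches HETNAM and HETATM records
        if get_hetatms and not hetatms_found and line.startswith("HET"):
            hetatms_found = True
    # Guard (not in the diff): skip the marker if no chains were read
    if hetatms_found and seqres:
        last_key = sorted(seqres.keys())[-1]
        seqres[last_key] += ["..."]
    return seqres


sample = [
    "SEQRES   1 A    3  GLY ALA SER",
    "SEQRES   1 B    2  MET LYS",
    "HETATM    1  C1  LIG A 101",
]
print(seqres_with_hetatm_marker(sample, get_hetatms=True))
```

Note that the marker is attached to the last chain in sorted order regardless of which chain the ligand actually belongs to; in the sample above the ligand sits on chain A but the marker lands on chain B.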

prepmd/lib/__init__.py

Lines changed: 2 additions & 1 deletion
@@ -6,4 +6,5 @@
 @author: rob
 """

-from . import icp
+from . import icp
+from . import mdaCIF

prepmd/lib/mdaCIF.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+# dummy file - will be removed next release
