RNAMotifDB is a research workflow for RNA 3D motif extraction, structural clustering, representative template selection, and template-guided RNA 3D structure prediction.
- Constructed a reusable RNA motif template database from experimentally determined RNA structures.
- Implemented motif extraction, clustering, representative-template selection, and template-search workflows.
- Enabled rapid identification of structurally related motifs for downstream RNA 3D modeling.
- Generated representative motif libraries suitable for template-guided assembly and structure prediction.
VFold is required for the full VFold-based pipeline and should be installed from the official site: https://rna.physics.missouri.edu/vfold_software_download/vfoldpipeline_download.html
Additional external tools may be required depending on the workflow step:
- VFoldPipeline
- VFold2D
- VFold3DLA
- DSSR/3DNA
- C++ compiler

Python dependencies can be installed with:
pip install -r requirements.txt
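
Because several workflow steps rely on external programs, it can help to check which ones are visible on `PATH` before running anything. The snippet below is a minimal sketch, not part of the repository: `find_missing_tools` is a hypothetical helper, and the executable names `g++` and `x3dna-dssr` are assumptions about a typical installation that may differ on your system. The VFold programs are left out because their executable names are not stated here.

```python
import shutil


def find_missing_tools(tool_names):
    """Return the names in tool_names that cannot be found on PATH."""
    return [name for name in tool_names if shutil.which(name) is None]


# Assumed executable names (not taken from this repository); edit as needed.
ASSUMED_TOOLS = ["g++", "x3dna-dssr"]

if __name__ == "__main__":
    missing = find_missing_tools(ASSUMED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All assumed tools were found on PATH.")
```

A tool reported as missing is only a problem if the workflow step you intend to run depends on it.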
## Repository Structure
- `src/database_building/`
Core scripts for building the RNA motif template database from motif PDB structures.
- `src/template_search_pipeline/`
Scripts and modified VFold components for RNA 2D input processing, motif extraction, template search, template trimming, and marker generation.
- `src/analysis_reports/`
Supporting scripts for database summaries, RNA-type annotation, clustering reports, and visualization.
- `docs/`
Workflow documentation and example commands.
- `data_examples/`
Small example input files.
- `results/`
Small example outputs from the workflow.
- `templates/`
Example VFold-style template database outputs.
- `figures/`
Workflow diagrams and result visualizations.
## Main Workflow
RNA motif PDB structures
↓
Sequence grouping
↓
RMSD-based structural clustering
↓
Representative template selection
↓
VFold-style database generation
↓
RNA 2D motif extraction
↓
Template search
↓
Template-guided RNA 3D modeling support
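
The first four stages of the workflow above (sequence grouping, RMSD-based clustering, representative selection) can be illustrated with a small, self-contained sketch. It is not the repository's implementation. All function names and the input layout are hypothetical, the clustering is a simple greedy leader scheme chosen for brevity, the representative is taken to be the cluster medoid, and the RMSD is computed directly on coordinates that are assumed to be already superposed. A real pipeline would typically superimpose structures (for example with the Kabsch algorithm) before measuring RMSD.

```python
from collections import defaultdict
from math import sqrt


def group_by_sequence(motifs):
    """Group motif IDs by identical sequence.

    motifs maps a motif ID to (sequence, coords), where coords is a list
    of (x, y, z) tuples. This layout is an assumption for illustration.
    """
    groups = defaultdict(list)
    for motif_id, (sequence, _coords) in motifs.items():
        groups[sequence].append(motif_id)
    return dict(groups)


def rmsd(coords_a, coords_b):
    """Coordinate RMSD between two pre-superposed, equal-length atom lists."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return sqrt(squared / len(coords_a))


def leader_cluster(ids, coords, cutoff):
    """Greedy clustering: join the first cluster whose leader is within cutoff."""
    clusters = []
    for motif_id in ids:
        for cluster in clusters:
            if rmsd(coords[motif_id], coords[cluster[0]]) <= cutoff:
                cluster.append(motif_id)
                break
        else:
            clusters.append([motif_id])  # no leader close enough: new cluster
    return clusters


def pick_representative(cluster, coords):
    """Return the medoid: the member with the smallest summed RMSD to the rest."""
    return min(
        cluster,
        key=lambda m: sum(rmsd(coords[m], coords[other]) for other in cluster),
    )


def select_templates(motifs, cutoff):
    """Map each sequence to the representative motif IDs of its clusters."""
    coords = {motif_id: xyz for motif_id, (_seq, xyz) in motifs.items()}
    templates = {}
    for sequence, ids in group_by_sequence(motifs).items():
        clusters = leader_cluster(ids, coords, cutoff)
        templates[sequence] = [pick_representative(c, coords) for c in clusters]
    return templates


def _line(y, z):
    """Toy 4-atom 'structure' along the x axis, shifted by y and z."""
    return [(float(x), y, z) for x in range(4)]


# Made-up example data, not real motif coordinates.
example_motifs = {
    "m1": ("GAAA", _line(0.0, 0.0)),
    "m2": ("GAAA", _line(0.0, 0.2)),
    "m3": ("GAAA", _line(0.0, 0.4)),
    "m4": ("GAAA", _line(5.0, 0.0)),
    "m5": ("UUCG", _line(0.0, 0.0)),
}

if __name__ == "__main__":
    print(select_templates(example_motifs, cutoff=1.0))
```

With the toy data and a cutoff of 1.0, `m1`, `m2`, and `m3` fall into one cluster with `m2` as its medoid, while `m4` and `m5` each form their own cluster. Leader clustering depends on input order and on the cutoff, so hierarchical or other clustering methods may be preferable for a production database.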
## Notes
This repository documents a research workflow. Some components depend on the original VFoldPipeline/VFold3DLA environment, compiled binaries, and expected directory structure. Standalone scripts and example files are provided where possible.