Complete orientation-aware counting system for genomic variants
- 🚀 High Performance: Rust-powered core engine with multi-threading
- 🧬 Complete Variant Support: SNP, MNP, insertion, deletion, and complex variants (DelIns, SNP+Indel)
- 📊 Orientation-Aware: Forward and reverse strand analysis with fragment counting
- 🔬 Statistical Analysis: Fisher's exact test for strand bias
- 📁 Flexible I/O: VCF and MAF input/output formats
- 🎯 Quality Filters: 8 configurable read and quality filtering options
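To make the strand-bias feature concrete, here is a minimal pure-Python sketch of a two-sided Fisher's exact test on a 2x2 table of reference/alternate counts by forward/reverse strand. The function name, the argument names, and the table layout are assumptions made for illustration. gbcms's Rust core may arrange the table or compute the p-value differently.

```python
from math import comb


def strand_bias_pvalue(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher's exact test p-value for a 2x2 strand table.

    Hypothetical helper for illustration; not part of the gbcms API.
    Rows are ref/alt, columns are forward/reverse strand.
    """
    ref_total = ref_fwd + ref_rev
    alt_total = alt_fwd + alt_rev
    fwd_total = ref_fwd + alt_fwd
    n = ref_total + alt_total
    if n == 0:
        return 1.0  # no reads, so no evidence of bias

    denom = comb(n, fwd_total)

    def prob(x):
        # Hypergeometric probability of x reference reads on the forward
        # strand, with all row and column totals held fixed.
        return comb(ref_total, x) * comb(alt_total, fwd_total - x) / denom

    lo = max(0, fwd_total - alt_total)
    hi = min(ref_total, fwd_total)
    p_obs = prob(ref_fwd)
    # Sum every table that is no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    probs = [prob(x) for x in range(lo, hi + 1)]
    p_value = sum(q for q in probs if q <= p_obs * (1 + 1e-7))
    return min(1.0, p_value)
```

A balanced table such as `strand_bias_pvalue(10, 10, 10, 10)` gives a p-value of about 1, while a fully segregated table such as `strand_bias_pvalue(10, 0, 0, 10)` gives a very small one, which would flag strand bias.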
Quick install:
pip install gbcms

From source (requires Rust):
git clone https://github.com/msk-access/gbcms.git
cd gbcms
pip install .

Docker:
docker pull ghcr.io/msk-access/gbcms:X.Y.Z  # Replace X.Y.Z with the latest version on PyPI

📖 Full documentation: https://msk-access.github.io/gbcms/
gbcms can be used in two ways:
Standalone CLI. Best for: quick analysis, local processing, direct control
gbcms run \
--variants variants.vcf \
--bam sample1.bam \
--fasta reference.fa \
--output-dir results/

Output: results/sample1.vcf
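For scripting the CLI from Python, the sketch below assembles the same command as an argument list. It uses only flags that appear in this README (`--variants`, `--bam`, `--fasta`, `--output-dir`, `--threads`). The helper names are hypothetical, and the output naming rule (BAM stem plus `.vcf`) is an assumption inferred from the `results/sample1.vcf` example rather than documented behavior.

```python
from pathlib import Path


def build_gbcms_command(bam, variants, fasta, output_dir, threads=None):
    """Build the argument list for one `gbcms run` call (hypothetical helper)."""
    cmd = [
        "gbcms", "run",
        "--variants", str(variants),
        "--bam", str(bam),
        "--fasta", str(fasta),
        "--output-dir", str(output_dir),
    ]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    return cmd


def expected_output(bam, output_dir):
    """Guess the output path.

    Assumption: the output file is named after the BAM file stem, as in
    the results/sample1.vcf example.
    """
    return Path(output_dir) / (Path(bam).stem + ".vcf")
```

To run it, pass the list to `subprocess.run(cmd, check=True)`. Building a list instead of a shell string avoids quoting problems with paths that contain spaces.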
Nextflow workflow. Best for: many samples, HPC clusters (SLURM), reproducible pipelines
nextflow run nextflow/main.nf \
--input samplesheet.csv \
--variants variants.vcf \
--fasta reference.fa \
-profile slurm

Features:
- ✅ Automatic parallelization across samples
- ✅ SLURM/HPC integration
- ✅ Container support (Docker/Singularity)
- ✅ Resume failed runs
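The samplesheet passed via `--input` can be generated with a short script when there are many BAM files. This sketch assumes the `sample,bam,bai` header shown in this README's Nextflow example, derives the sample name from the BAM file name, and leaves the `bai` column empty as that example does. The function name and the naming rule are illustrative assumptions, not part of gbcms.

```python
import csv
from pathlib import Path


def write_samplesheet(bam_paths, out_path="samplesheet.csv"):
    """Write a sample,bam,bai samplesheet (hypothetical helper)."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(["sample", "bam", "bai"])
        for bam in bam_paths:
            bam = Path(bam)
            # Assumption: sample name is the BAM file name without .bam;
            # the bai column is left empty, as in the README example.
            writer.writerow([bam.stem, str(bam), ""])
    return out_path
```

For example, `write_samplesheet(["/path/to/tumor1.bam", "/path/to/tumor2.bam"])` writes a header row followed by one row per BAM file.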
| Scenario | Recommendation |
|---|---|
| 1-10 samples, local machine | CLI |
| 10+ samples, HPC cluster | Nextflow |
| Quick ad-hoc analysis | CLI |
| Production pipeline | Nextflow |
| Need auto-parallelization | Nextflow |
| Full manual control | CLI |
gbcms run \
--variants variants.vcf \
--bam tumor.bam \
--fasta hg19.fa \
--output-dir results/ \
--threads 4

gbcms run \
--variants variants.vcf \
--bam-list samples.txt \
--fasta hg19.fa \
--output-dir results/

# samplesheet.csv:
# sample,bam,bai
# tumor1,/path/to/tumor1.bam,
# tumor2,/path/to/tumor2.bam,
nextflow run nextflow/main.nf \
--input samplesheet.csv \
--variants variants.vcf \
--fasta hg19.fa \
--outdir results \
-profile slurm

📚 Full Documentation: https://msk-access.github.io/gbcms/
See CONTRIBUTING.md for development guidelines.
To contribute to documentation, see the gh-pages branch.
If you use gbcms in your research, please cite:
Shah, R. et al. (2025). gbcms: A high-performance orientation-aware genotype counting system for genomic variants. Available at: https://github.com/msk-access/gbcms
BibTeX:
@software{pygbcms,
author = {Shah, Ronak and contributors},
title = {gbcms: A high-performance orientation-aware genotype counting system for genomic variants},
year = {2025},
url = {https://github.com/msk-access/gbcms},
note = {GitHub repository}
}

AGPL-3.0 - see LICENSE for details.
- 🐛 Issues: https://github.com/msk-access/gbcms/issues
- 💬 Discussions: https://github.com/msk-access/gbcms/discussions