Merged
22 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### `Added`

 - [#109](https://github.com/nf-core/seqinspector/pull/109) Adds ToulligQC module for long read QC
+- [#134](https://github.com/nf-core/seqinspector/pull/134) Added sequali module.
 - [#202](https://github.com/nf-core/seqinspector/pull/202) Added support for fasta fai file as input (via params or igenomes) for the pipeline
 - [#204](https://github.com/nf-core/seqinspector/pull/204) Added Fastp module
 - [#206](https://github.com/nf-core/seqinspector/pull/206) Added FASTQE for more comprehensive QC of FASTQ files
4 changes: 4 additions & 0 deletions CITATIONS.md
@@ -58,6 +58,10 @@

 - [Seqtk](https://github.com/lh3/seqtk)

+- [Sequali](https://sequali.readthedocs.io/en/latest/)
+
+> Vorderman R. Sequali: efficient and comprehensive quality control of short- and long-read sequencing data. Bioinformatics Advances, 2025. doi: 10.1093/bioadv/vbaf010
+
 - [ToulligQC](https://github.com/GenomiqueENS/toulligQ)

 ## Software packaging/containerisation tools
4 changes: 3 additions & 1 deletion README.md
@@ -42,7 +42,8 @@ If provided, nf-core/seqinspector can also parse statistics from an Illumina run
 | `QC` | [`FASTQE`](https://fastqe.com/) | Read QC | [RNA, DNA] | [N/A] | yes |
 | `QC` | [`FastqScreen`](https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/) | Basic contamination detection | [RNA, DNA] | [N/A] | yes |
 | `QC` | [`SeqFu Stats`](https://github.com/telatin/seqfu2) | Sequence statistics | [RNA, DNA] | [N/A] | yes |
-| `Taxonomic Classification` | [`Kraken2`](https://ccb.jhu.edu/software/kraken2/) | Performs taxonomic classification and/or profiling | [RNA, DNA] | No |
+| `QC` | [`Sequali`](https://sequali.readthedocs.io/en/latest/) | Read QC for long and short reads. | [RNA, DNA] | [N/A] | yes |
+| `Taxonomic Classification` | [`Kraken2`](https://ccb.jhu.edu/software/kraken2/) | Performs taxonomic classification and/or profiling | [RNA, DNA] | [N/A] | no |
 | `QC` | [`Picard collect multiple metrics`](https://broadinstitute.github.io/picard/picard-metric-definitions.html) | Collect multiple QC metrics | [RNA, DNA] | [Bwamem2, SAMtools, `--genome`] | yes |
 | `QC` | [`Picard_collecthsmetrics`](https://gatk.broadinstitute.org/hc/en-us/articles/360036856051-CollectHsMetrics-Picard) | Collect alignment QC metrics of hybrid-selection data. | [RNA, DNA] | [Bwamem2, SAMtools, `--fasta`, `--bait_intervals`, `--target_intervals` (`--ref_dict`)] | no |
 | `Reporting` | [`MultiQC`](http://multiqc.info/) | Present QC for raw reads | [RNA, DNA, synthetic] | [N/A] | yes |
@@ -75,6 +76,7 @@ If provided, nf-core/seqinspector can also parse statistics from an Illumina run
 | samtools | 1.23 |
 | seqfu | 1.22.3 |
 | seqtk | 1.4 |
+| sequali | 0.12.0 |

 ## Usage

4 changes: 4 additions & 0 deletions assets/multiqc_config.yml
@@ -19,6 +19,7 @@ disable_version_detection: true

 fn_clean_trim:
 - "_screen" # Added by FastqScreen
+- ".json"

 # Make sample name with indexes a bit prettier
 # for SE: "SampleName_01" -> "SampleName #01"
@@ -35,3 +36,6 @@ table_sample_merge:
 "Read2":
 - type: regex
 pattern: "_2$"
+
+use_filename_as_sample_name:
+- "sequali"
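As a rough illustration of what the two sample-name settings in this hunk do, the sketch below approximates MultiQC's trimming behaviour in plain Python: each `fn_clean_trim` entry is removed when it appears at the start or end of a name, and the `Read2` regex from `table_sample_merge` flags second-read names. The helper names and the example file names are hypothetical, and this is not MultiQC's own code.

```python
import re

# Values taken from the config hunk above.
FN_CLEAN_TRIM = ["_screen", ".json"]
READ2_PATTERN = re.compile(r"_2$")


def trim_sample_name(name: str, trims=FN_CLEAN_TRIM) -> str:
    """Remove each configured string from the start or end of a sample name.

    Hypothetical helper approximating the effect of `fn_clean_trim`.
    """
    for trim in trims:
        if name.endswith(trim):
            name = name[: -len(trim)]
        if name.startswith(trim):
            name = name[len(trim):]
    return name


def is_read2(name: str) -> bool:
    """Return True if the cleaned name matches the Read2 merge regex."""
    return READ2_PATTERN.search(name) is not None


if __name__ == "__main__":
    # Hypothetical file names, used only to show the trimming.
    for filename in ["SampleA_1.json", "SampleA_2.json", "SampleA_screen"]:
        cleaned = trim_sample_name(filename)
        print(filename, "->", cleaned, "(Read2)" if is_read2(cleaned) else "")
```

With `use_filename_as_sample_name` enabled for Sequali, the sample name starts from the JSON file name, which is presumably why `.json` is added to the trim list.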
14 changes: 14 additions & 0 deletions docs/output.md
@@ -20,6 +20,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and can generat
 - [SeqFu](#seqfu) - Statistics for FASTA or FASTQ files
 - [Seqtk](#seqtk) - Subsample a specific number of reads per sample
 - [FastQC](#fastqc) - Raw read QC
+- [Sequali](#sequali) - Sequence quality metrics for short and long reads
 - [FASTQE](#fastqe) - Raw read QC
 - [FastP](#fastp) - Trimming and filtering of raw reads
 - [FastQ Screen](#fastq-screen) - Mapping against a set of references for basic contamination QC
@@ -145,6 +146,19 @@ In this pipeline, the `seqfu stats` module is used to produce general quality me
 It provides information about the quality score distribution across your reads, per base sequence content (%A/T/G/C), adapter contamination and overrepresented sequences.
 For further reading and documentation see the [FastQC help pages](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/).

+### Sequali
+
+<details markdown="1">
+<summary>Output files</summary>
+
+- `reports/sequali/[sample_id]/`
+- `*.html`: Sequali report containing quality metrics.
+- `*.json`: JSON containing the Sequali data, used for generating MultiQC report.
+
+</details>
+
+[Sequali](https://sequali.readthedocs.io/en/latest/) gives general quality metrics for short and long sequenced reads. It provides information about the quality score distribution across your reads, GC content, duplication levels, length distribution, adapter contamination (Illumina and Oxford Nanopore) and overrepresented sequences.
+
 ### FASTQE

 <details markdown="1">
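For readers who want to inspect the Sequali JSON outside MultiQC, a minimal sketch follows. It assumes only the `reports/sequali/[sample_id]/*.json` layout documented above; the function name is hypothetical, and nothing is assumed about the keys inside the JSON files.

```python
import json
from pathlib import Path


def collect_sequali_json(outdir: str) -> dict:
    """Map each sample directory to its parsed Sequali JSON reports.

    Hypothetical helper: relies only on the documented
    reports/sequali/[sample_id]/*.json layout under the pipeline outdir.
    """
    reports = {}
    for json_path in sorted(Path(outdir).glob("reports/sequali/*/*.json")):
        sample_id = json_path.parent.name
        with json_path.open() as handle:
            reports.setdefault(sample_id, {})[json_path.name] = json.load(handle)
    return reports
```

The result is keyed by sample directory first and file name second, so paired-end samples with one JSON per read file stay grouped together.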
1 change: 1 addition & 0 deletions docs/usage.md
@@ -138,6 +138,7 @@ Currently, the following tools are run as default:
 - picard_collectmultiplemetrics
 - rundirparser
 - seqfu_stats
+- sequali

 #### Choose specific tools

6 changes: 6 additions & 0 deletions modules.json
@@ -116,6 +116,12 @@
 "git_sha": "a46713779030a5f508117080cbf4b693dd4c6e33",
 "installed_by": ["modules"]
 },
+"sequali": {
+"branch": "master",
+"git_sha": "f37e31e7af4c75dc51c5def63afa521caa941cd6",
+"installed_by": ["modules"],
+"patch": "modules/nf-core/sequali/sequali.diff"
+},
 "toulligqc": {
 "branch": "master",
 "git_sha": "d9137377f4dd6246242829772eb1949b96de1ef0",
7 changes: 7 additions & 0 deletions modules/nf-core/sequali/environment.yml
46 changes: 46 additions & 0 deletions modules/nf-core/sequali/main.nf
80 changes: 80 additions & 0 deletions modules/nf-core/sequali/meta.yml
21 changes: 21 additions & 0 deletions modules/nf-core/sequali/sequali.diff

These generated files are not rendered by default.

4 changes: 2 additions & 2 deletions nextflow_schema.json
@@ -51,7 +51,7 @@
 "tools": {
 "type": "string",
 "description": "Comma-separated string of tools to run",
-"pattern": "^((checkqc|fastp|fastqc|fastqe|fastqscreen|fq_lint|kraken2|multiqcsav|picard_collecthsmetrics|picard_collectmultiplemetrics|rundirparser|seqfu_stats|toulligqc)?,?)*(?<!,)$",
+"pattern": "^((checkqc|fastp|fastqc|fastqe|fastqscreen|fq_lint|kraken2|multiqcsav|picard_collecthsmetrics|picard_collectmultiplemetrics|rundirparser|seqfu_stats|sequali|toulligqc)?,?)*(?<!,)$",
 "fa_icon": "fas fa-sort-amount-asc"
 },
 "tools_bundle": {
@@ -64,7 +64,7 @@
 "skip_tools": {
 "type": "string",
 "description": "Comma-separated string of tools to skip - overrides any other means of tools selection",
-"pattern": "^((checkqc|fastp|fastqc|fastqe|fastqscreen|fq_lint|kraken2|multiqcsav|picard_collecthsmetrics|picard_collectmultiplemetrics|rundirparser|seqfu_stats|toulligqc)?,?)*(?<!,)$",
+"pattern": "^((checkqc|fastp|fastqc|fastqe|fastqscreen|fq_lint|kraken2|multiqcsav|picard_collecthsmetrics|picard_collectmultiplemetrics|rundirparser|seqfu_stats|sequali|toulligqc)?,?)*(?<!,)$",
 "fa_icon": "fas fa-window-close "
 }
 }
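To sanity-check the updated pattern, the sketch below compiles it with Python's `re` module and tries a few inputs. The pipeline itself validates parameters through its schema tooling rather than Python, so this is only a stand-in, and `is_valid_tools_string` is a hypothetical helper name.

```python
import re

# Pattern copied from the updated `tools` / `skip_tools` entries above.
TOOLS_PATTERN = re.compile(
    r"^((checkqc|fastp|fastqc|fastqe|fastqscreen|fq_lint|kraken2|multiqcsav|"
    r"picard_collecthsmetrics|picard_collectmultiplemetrics|rundirparser|"
    r"seqfu_stats|sequali|toulligqc)?,?)*(?<!,)$"
)


def is_valid_tools_string(value: str) -> bool:
    """Return True if `value` satisfies the schema pattern (hypothetical helper)."""
    return TOOLS_PATTERN.match(value) is not None


if __name__ == "__main__":
    for candidate in ["sequali", "fastqc,sequali", "fastqc,sequali,", "sequal"]:
        print(f"{candidate!r}: {is_valid_tools_string(candidate)}")
```

The negative lookbehind `(?<!,)` is what rejects a trailing comma, and a name outside the alternation (such as the truncated `sequal`) fails because nothing in the group can consume it.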
5 changes: 5 additions & 0 deletions subworkflows/local/utils_nfcore_seqinspector_pipeline/main.nf
@@ -252,6 +252,7 @@ def toolCitationText() {
 "SAMTOOLS (Danecek et al. 2021),",
 params.sample_size > 0 ? "Seqtk (Li 2021)," : "",
 "SeqFu (Telatin et al. 2021),",
+"Sequali (Vorderman 2025),",
 ".",
 ].join(' ').trim()

@@ -268,6 +269,7 @@ def toolBibliographyText() {
 "<li>Danecek P., Bonfield JK., Liddle J., & al. (2021). Twelve years of SAMtools and BCFtools.</li>",
 params.sample_size > 0 ? "<li>Li, H. SeqTk. Available online: https://github.com/lh3/seqtk (accessed on 6 May 2021)</li>" : "",
 "<li>Telatin, A.; Fariselli, P.; Birolo, G. SeqFu: A Suite of Utilities for the Robust and Reproducible Manipulation of Sequence Files. Bioengineering 2021, 8, 59. https://doi.org/10.3390/bioengineering8050059</li>",
+"<li>Vorderman, R. Sequali: efficient and comprehensive quality control of short- and long-read sequencing data. Bioinformatics Advances, 2025. doi: 10.1093/bioadv/vbaf010</li>"
 ].join(' ').trim()

 return reference_text
@@ -334,6 +336,7 @@ def defineToolsList(input_bundle, input_tools, input_skip) {
 tools_list << 'picard_collectmultiplemetrics'
 tools_list << 'rundirparser'
 tools_list << 'seqfu_stats'
+tools_list << 'sequali'
 tools_list << 'toulligqc'
 }
 if ('bam' in bundle_list) {
@@ -352,6 +355,7 @@ def defineToolsList(input_bundle, input_tools, input_skip) {
 tools_list << 'picard_collectmultiplemetrics'
 tools_list << 'rundirparser'
 tools_list << 'seqfu_stats'
+tools_list << 'sequali'
 }
 if ('illumina' in bundle_list) {
 tools_list << 'checkqc'
@@ -369,6 +373,7 @@ def defineToolsList(input_bundle, input_tools, input_skip) {
 tools_list << 'fastqc'
 tools_list << 'fastqscreen'
 tools_list << 'seqfu_stats'
+tools_list << 'sequali'
 tools_list << 'toulligqc'
 }

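The hunks above show only fragments of `defineToolsList`, so the following Python sketch is an assumption about the general shape of the selection logic rather than a port of the Groovy function: tools contributed by bundles are combined with explicitly requested tools, and anything named in `skip_tools` is removed last, which matches the schema description "overrides any other means of tools selection". The function and argument names are hypothetical.

```python
def resolve_tools(bundle_tools, extra_tools="", skip_tools=""):
    """Combine bundle and explicit tools, then drop skipped ones.

    Hypothetical sketch: `bundle_tools` stands for the names a bundle
    appends to tools_list; `extra_tools` and `skip_tools` are
    comma-separated strings like the `tools` and `skip_tools` params.
    """
    def split(csv):
        # Ignore empty fragments such as those left by a doubled comma.
        return [tool for tool in csv.split(",") if tool]

    selected = []
    for tool in list(bundle_tools) + split(extra_tools):
        if tool not in selected:  # keep first occurrence, preserve order
            selected.append(tool)

    skipped = set(split(skip_tools))
    return [tool for tool in selected if tool not in skipped]


if __name__ == "__main__":
    # Tail of one bundle as visible in the last hunk above.
    bundle = ["fastqc", "fastqscreen", "seqfu_stats", "sequali", "toulligqc"]
    print(resolve_tools(bundle, extra_tools="kraken2", skip_tools="sequali"))
```

Applying the skip list last means that adding `sequali` to a bundle, as this PR does, still leaves users able to opt out with `--skip_tools sequali`.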
1 change: 1 addition & 0 deletions tests/.nftignore
@@ -24,4 +24,5 @@ reports/picard_collectmultiplemetrics/*/alignment_summary_metrics
 reports/picard_collectmultiplemetrics/*/base_distribution_by_cycle_metrics
 reports/picard_collectmultiplemetrics/*/quality_by_cycle_metrics
 reports/picard_collectmultiplemetrics/*/quality_distribution_metrics
+reports/sequali/**
 reports/toulligqc/**