feat: add optional library_id, lane, flowcell samplesheet columns #147
…port Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
c2f31dc to a00b2c8
…rary_id and nonrandom multi-lane workflows

Add ext.prefix to CORRECTUMIS config for lane-aware output naming, fix GroupReadsByUmi stub missing grouped-read-metrics.txt, update existing nonrandom/mixed UMI tests for lane-aware filenames, and add new tests covering library_id as processing unit and nonrandom UMIs with multiple lanes (both auto-assigned and explicit metadata).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    ext.args = '--quiet'
    ext.prefix = { meta.flowcell ? "${meta.id}.${meta.flowcell}.${meta.lane}" : "${meta.id}.${meta.lane}" }
    publishDir = [
        path: { "${params.outdir}/preprocessing/fastqc/${meta.id}" },
Because the publish path uses meta.id, results are currently published under library_id when one is provided, not under sample.
I'm open to changing this. I'm not sure which organization is more intuitive, limits clutter, or makes it easier to find the desired results.
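To make the naming rule under discussion concrete, here is a minimal Python sketch of the `ext.prefix` logic from the diff above. It is not the pipeline's code: `make_prefix` is a hypothetical helper, and the handling of a missing lane (simply dropping that part) is an assumption, since the conversation only says that null `meta.lane` is guarded.

```python
def make_prefix(meta: dict) -> str:
    """Build a lane-aware output prefix from a sample meta map (illustrative only)."""
    parts = [str(meta["id"])]
    # Mirrors the closure: the flowcell is included only when it is set.
    if meta.get("flowcell"):
        parts.append(str(meta["flowcell"]))
    # Assumption: a missing lane is dropped from the prefix.
    # `is not None` keeps lane 0, which the review thread treats as valid.
    if meta.get("lane") is not None:
        parts.append(str(meta["lane"]))
    return ".".join(parts)


if __name__ == "__main__":
    print(make_prefix({"id": "SAMPLE", "lane": 1}))
    print(make_prefix({"id": "SAMPLE", "flowcell": "HXXYYBBXX", "lane": "L001"}))
```

Since `meta.id` resolves to `library_id` when one is given, both the prefix and the publish directory follow the library rather than the sample, which is the trade-off raised in the comment.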
6408d30 to d4478f5
d4478f5 to d00ee1f
- Fix umi_file error message to use `id` instead of `metas[0].id`
- Normalize flowcell values to null early for consistent representation
- Guard ext.prefix closures for null meta.lane (module-level tests)
- Add comment documenting shared_meta safety invariant
- Add positive test for library_id with multi-lane merge path

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    """
    touch ${prefix}.grouped.bam
    touch ${prefix}.grouped-family-sizes.txt
    touch ${prefix}.grouped-read-metrics.txt
The script was previously updated to output the read metrics, but the stub was missed.
499aef2 to f54ec0f
    },
    "lane": {
        "type": ["string", "integer"],
        "pattern": "^[A-Za-z0-9]+$",
question: `pattern` only validates strings. Should we also enforce integers > 0?
question: why not just have the type be a string?
This is an nf-schema workaround. We want this to be a string so users can use values like "L001". If you leave the type as only string, an integer value will throw an error (even if you quote it):
--input (samplesheet.csv): Validation of file failed:
-> Entry 1: Error for field 'lane' (1): Value is [integer] but should be [string]
We could force it to be greater than 0, but I don't see a specific reason to do so. If someone wanted to use lane=0, it wouldn't break anything. It could even be a useful recoding. Say you have data from one sequencing run where the lanes were not split and another where they were. You could do something like:
sample,flowcell,lane
control,cell_a,0
control,cell_b,1
control,cell_b,2
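As a rough illustration of the checks discussed in this thread, the Python sketch below applies the schema's `pattern` to each lane value and flags repeated `(flowcell, lane)` pairs. It is not nf-schema: `check_lanes` is a hypothetical helper, scoping uniqueness to each sample is an assumption, and because the `csv` module reads every field as text, the string-versus-integer typing issue described above does not arise here.

```python
import csv
import io
import re

# Same pattern as the schema snippet under review.
LANE_PATTERN = re.compile(r"^[A-Za-z0-9]+$")

EXAMPLE = """sample,flowcell,lane
control,cell_a,0
control,cell_b,1
control,cell_b,2
"""


def check_lanes(samplesheet_text: str) -> list:
    """Return a list of error strings; an empty list means these checks passed."""
    errors = []
    seen = set()
    reader = csv.DictReader(io.StringIO(samplesheet_text))
    for entry, row in enumerate(reader, start=1):
        lane = row["lane"]
        if not LANE_PATTERN.fullmatch(lane):
            errors.append(f"Entry {entry}: lane '{lane}' does not match the pattern")
        # Assumption: (flowcell, lane) must be unique within a sample.
        key = (row["sample"], row["flowcell"], lane)
        if key in seen:
            errors.append(f"Entry {entry}: duplicate (flowcell, lane) for '{row['sample']}'")
        seen.add(key)
    return errors


if __name__ == "__main__":
    print(check_lanes(EXAMPLE))
```

Under these assumptions the example samplesheet passes, including the `lane=0` row used to recode the unsplit run.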
Move integration test samplesheets to nf-core/test-datasets and convert stub tests to full integration tests with trace counts, software version snapshots, content snapshots, and output file existence checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- Add optional `library_id`, `lane`, and `flowcell` columns to the samplesheet schema
- … `--sample` to biological sample (BAM SM tag) and `--library` to library prep (BAM LB tag) in FastqToBam

Changes
- `sample` now maps to `meta.sample`; `meta.id` is computed as `library_id` (when provided) or `sample`
- … `ext.prefix` (e.g., `SAMPLE.1.html`, `SAMPLE.HXXYYBBXX.L001.html`)
- … `--sample ${meta.sample} --library ${meta.id}` (was both `${meta.id}`)
- … `lane`/`flowcell` from meta before `groupKey` so lanes merge correctly
- … `library_id`, `lane`, `flowcell`; uniqueness of `(flowcell, lane)` pairs; global uniqueness of `library_id` across samples

Test plan
- … `.{lane}` suffix in pre-merge output filenames

Closes #146
🤖 Generated with Claude Code
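
The meta handling described in the PR description can be sketched in Python as follows. This is a simplified model rather than the workflow's Groovy code: `build_meta`, `group_lanes`, `fastq_to_bam_args`, and the `fastq` column are hypothetical names, and grouping on `(sample, id)` stands in for removing `lane`/`flowcell` from the meta map before `groupKey`.

```python
from collections import defaultdict


def build_meta(row: dict) -> dict:
    """Derive the meta map from one samplesheet row."""
    return {
        "sample": row["sample"],
        # meta.id is library_id when provided, otherwise the sample name.
        "id": row.get("library_id") or row["sample"],
        "lane": row.get("lane"),
        # Empty flowcell values are normalized to None early.
        "flowcell": row.get("flowcell") or None,
    }


def group_lanes(rows: list) -> dict:
    """Group inputs per processing unit, ignoring lane and flowcell."""
    groups = defaultdict(list)
    for row in rows:
        meta = build_meta(row)
        # Lane and flowcell are left out of the key so lanes merge together.
        groups[(meta["sample"], meta["id"])].append(row["fastq"])
    return dict(groups)


def fastq_to_bam_args(meta: dict) -> str:
    """Sample feeds the SM tag and id (the library) feeds the LB tag."""
    return f"--sample {meta['sample']} --library {meta['id']}"


if __name__ == "__main__":
    rows = [
        {"sample": "control", "library_id": "lib1", "flowcell": "cell_b", "lane": "1", "fastq": "a.fq"},
        {"sample": "control", "library_id": "lib1", "flowcell": "cell_b", "lane": "2", "fastq": "b.fq"},
    ]
    print(group_lanes(rows))
    print(fastq_to_bam_args(build_meta(rows[0])))
```

If lane or flowcell stayed in the grouping key, each lane would form its own group and nothing would merge, which is the behavior the change to the key is meant to avoid.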