Config System YAML #9

@jfear

Starting to think about the config system YAML.

#### General Settings ####
settings: # location of system-level settings
    title: My Very Cool Project
    author: Bob
    data: /data/bob/original_data # path-like settings
    python2: py2.7 # conda environment names
    env: HOME # names by which to access specific envs

#### Experiment Level Settings ####
exp.settings: # experiment level settings, settings that apply to all samples
    sampleinfo: sample_metadata.csv # Sample information relating sample specific settings to sample ids
    fastq_suffix: '.fastq.gz'  # it would be nice to be able to define a setting here that applies to all samples, or to define it per sample in the sampleinfo table in case samples differ.
    annotation: # Need some way to specify which annotation to use; this may not be the best place for it.
        genic: /data/...
        transcript: /data/....
        intergenic: /data/...
    models: # add modeling information here
        formula: ~ sex + tissue + time
        factors: # tell which columns in sample table should be treated like factors
             - sex
             - tissue
             - time

#### Workflow Settings ####
# I think a naming scheme that follows the folder structure would be useful. For example:
# if there is a workflows folder then we would have
workflows.qc: # could define workflow specific settings
    steps_to_run: # list the pieces of the pipeline to run (listing the pieces not to run may be better)
        - fastqc
        - rseqc
    trim: True # or could have boolean switches to change workflow behavior

workflows.align:
    aligner: 'tophat2=2.1.0' # define what software to use and optionally what version
    aggregated_output_dir: /data/...
    report_output_dir: /data/...

workflows.rnaseq: ...

workflows.references: ... 

#### Rule Specific Settings ####
rules.align.bowtie2: # rule-level settings, again named after the folder structure (if we need a folder structure)
    cluster: # It would be nice to keep cluster settings with the rule settings, but I can't think of a way to make that work; we probably just need a separate cluster config.
        threads: 16
        mem: 60g
        walltime: '8:00:00' # quoted so YAML 1.1 parsers keep it as a string instead of reading a base-60 integer
    index: /data/... # bowtie index prefix
    params: # Access to any parameters that need to be set
        options: -p 16 -k 8 # place to change the options
    aln_suffix: '.bt2.bam'  # place to change how files are named
    log_suffix: '.bt2.log'
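
To make the naming idea above a bit more concrete, here is a minimal Python sketch of how a workflow might read this layout. Nothing in it is an existing API: `section`, `split_cluster`, and `sample_setting` are hypothetical helper names, the `config` dict stands in for whatever the YAML parser would return, and the idea of a `fastq_suffix` column in the sample table is an assumption based on the comment next to `fastq_suffix`.

```python
# Hypothetical sketch: `config` is a hand-written stand-in for the parsed YAML,
# trimmed to the keys used below.
config = {
    "exp.settings": {"fastq_suffix": ".fastq.gz"},
    "rules.align.bowtie2": {
        "cluster": {"threads": 16, "mem": "60g", "walltime": "8:00:00"},
        "index": "/data/...",  # elided in the proposal, left as-is
        "params": {"options": "-p 16 -k 8"},
        "aln_suffix": ".bt2.bam",
        "log_suffix": ".bt2.log",
    },
}


def section(config, *parts):
    """Return the block whose top-level key is the dotted join of `parts`."""
    return config[".".join(parts)]


def split_cluster(rule_cfg):
    """Separate the `cluster` sub-block from the remaining rule settings."""
    rest = {key: value for key, value in rule_cfg.items() if key != "cluster"}
    return rest, rule_cfg.get("cluster", {})


def sample_setting(config, sample_row, key):
    """Prefer a per-sample value; fall back to the experiment-level one."""
    value = sample_row.get(key)
    if value not in (None, ""):
        return value
    return config["exp.settings"][key]


# Folder-style lookup: rules/align/bowtie2 -> "rules.align.bowtie2"
rule_cfg, cluster_cfg = split_cluster(section(config, "rules", "align", "bowtie2"))

# One sample relies on the experiment-wide suffix, the other overrides it.
default_suffix = sample_setting(config, {"sampleid": "s1"}, "fastq_suffix")
custom_suffix = sample_setting(
    config, {"sampleid": "s2", "fastq_suffix": ".fq.gz"}, "fastq_suffix"
)
```

Splitting out `cluster` at load time is one possible way to keep cluster settings next to the rule in the YAML while still handing the scheduler its own dict, so a separate cluster config file might not be strictly required.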
