feat: support default versions for multiple Ensembl databases #69

Description

@uniqueg

Problem

The various Ensembl databases for genome resources (core Ensembl, Metazoa, Fungi, Protists, Plants, Bacteria) each have their own versioning. However, ZARP-cli currently provides only a single option for setting the default release version. This can create problems, especially for users who frequently run analyses on libraries from multiple sources.

For example, a version number of, say, 50 could be the latest version of one database but a very old version of another. It is also quite possible that the version desired for one database is not yet (or no longer) available in another.

Solution

By defining database-specific default versions, users or groups of users will be able to run all of their analyses on a common recent release version for each group of organisms/sources.

Context

For reference, the current latest versions and corresponding release dates for the individual databases are:

  • Ensembl: 110 (July '23)
  • Ensembl Metazoa: 57 (July '23)
  • Ensembl Fungi: 57 (July '23)
  • Ensembl Protists: 57 (July '23)
  • Ensembl Plants: 57 (July '23)
  • Ensembl Bacteria: 57 (July '23)

From this it seems that releases are coordinated and that only two different version series are in use (currently at 110 and 57).

Therefore, it would be sufficient, at least for the moment, to provide just one more default version parameter (shared by Metazoa, Fungi, Protists, Plants and Bacteria).
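To make the two-parameter idea concrete, here is a minimal sketch. The class name `EnsemblDefaults`, the field names `version_core` and `version_non_core`, and the lowercase database keys are assumptions made for illustration; they are not existing ZARP-cli settings.

```python
from dataclasses import dataclass

# Databases that, per the list above, currently share one version series.
NON_CORE_DATABASES = frozenset(
    {"metazoa", "fungi", "protists", "plants", "bacteria"}
)


@dataclass(frozen=True)
class EnsemblDefaults:
    """Hypothetical settings holder; field names are not ZARP-cli's."""

    version_core: int      # core Ensembl, e.g. 110
    version_non_core: int  # Metazoa, Fungi, Protists, Plants, Bacteria, e.g. 57

    def for_database(self, database: str) -> int:
        """Return the default release version for the given database."""
        key = database.strip().lower()
        if key == "ensembl":
            return self.version_core
        if key in NON_CORE_DATABASES:
            return self.version_non_core
        raise ValueError(f"unknown Ensembl database: {database!r}")


defaults = EnsemblDefaults(version_core=110, version_non_core=57)
```

If the non-core databases ever stop sharing a version series, the same lookup could be backed by one field per database without changing its callers.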

Suggested implementation

ZARP-cli currently has no knowledge of which organism/source is fetched from which Ensembl database. Therefore, apart from adding the additional parameter, this information needs to be encoded somewhere, preferably as an additional column in ./data/genome_assemblies_map.tsv.
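The lookup this would enable could be sketched as follows. The column names `organism` and `ensembl_db`, the sample rows, and both function names are assumptions made for illustration; the actual layout of the mapping file is not described in this issue. Letting an explicitly supplied version take precedence over the default is also an assumption.

```python
import csv
import io


def load_database_map(handle, key_column="organism", db_column="ensembl_db"):
    """Read a tab-separated table and map each organism to its database.

    Both column names are hypothetical; adjust them to the real file.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[key_column]: row[db_column].strip().lower() for row in reader}


def resolve_version(organism, database_map, default_versions, override=None):
    """Pick the release version to use for an organism.

    An explicitly supplied version wins; otherwise the default for the
    organism's database is used.
    """
    if override is not None:
        return override
    return default_versions[database_map[organism]]


# Illustrative data only, not the contents of the real mapping file.
SAMPLE_TSV = (
    "organism\tensembl_db\n"
    "homo_sapiens\tensembl\n"
    "arabidopsis_thaliana\tplants\n"
)

database_map = load_database_map(io.StringIO(SAMPLE_TSV))
default_versions = {"ensembl": 110, "plants": 57}
```

Keeping the organism-to-database mapping in the TSV and the version numbers in the user settings means that a new release only requires a settings change, not an edit to the mapping file.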

Labels: future, will not be fixed for NOW
