
Commit 8a17797

package: Add pre-commit for lint/formatting and lint/format package
1 parent 605d4f5 commit 8a17797

139 files changed

Lines changed: 3716 additions & 21890 deletions


.github/ISSUE_TEMPLATE/bug_report.yml

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@ name: Bug report
 description: Report something that is broken or incorrect, the more information you include, the easier it will be to help.
 labels: bug
 body:
-
   - type: textarea
     id: description
     attributes:

.github/workflows/ci.yaml

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+name: CI on push
+
+on: push
+
+jobs:
+  ci:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+      - name: Set up Python
+        # This is the version of the action for setting up Python, not the Python version.
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.14'
+          cache: 'pip'
+      - name: Install dependencies
+        run: |
+          python -m pip install pre-commit
+
+      - name: Install pre-commit hooks
+        run: pre-commit install
+
+      - name: Run pre-commit hooks for linting and other checks
+        run: pre-commit run --all-files
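In CI, the `pre-commit install` step only sets up local git hooks, which `pre-commit run --all-files` does not need, so the job would work without it. As an aside, a sketch of an equivalent job using pre-commit's published action (assumed here to be `pre-commit/action@v3.0.1`; this is not what the commit uses) — a config fragment, not a drop-in replacement:

```yaml
# Hypothetical alternative CI job: the action installs pre-commit,
# caches hook environments, and runs `pre-commit run --all-files`.
jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.14'
      - uses: pre-commit/action@v3.0.1 # assumed tag
```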

.pre-commit-config.yaml

Lines changed: 32 additions & 8 deletions
@@ -1,13 +1,37 @@
+# This is the configuration for pre-commit, a local framework for managing pre-commit hooks
+# Check out the docs at: https://pre-commit.com/
+
+default_stages: [pre-commit]
 repos:
-  - repo: https://github.com/pre-commit/mirrors-prettier
-    rev: "v3.1.0"
+  - repo: https://github.com/rbubley/mirrors-prettier
+    rev: "v3.8.1" # Use the sha / tag you want to point at
     hooks:
       - id: prettier
         additional_dependencies:
-          - prettier@3.2.5
-
-  - repo: https://github.com/editorconfig-checker/editorconfig-checker.python
-    rev: "3.0.3"
+          - prettier@2.1.2
+          - "@prettier/plugin-xml@0.12.0"
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v6.0.0
+    hooks:
+      - id: check-case-conflict
+      - id: check-docstring-first
+      - id: check-executables-have-shebangs
+      - id: check-toml
+      - id: check-json
+        exclude: |
+          (?x)^(
+            assets/adaptivecard.json|
+            assets/slackreport.json
+          )$
+      - id: detect-private-key
+      - id: end-of-file-fixer
+      - id: trailing-whitespace
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.15.4
     hooks:
-      - id: editorconfig-checker
-        alias: ec
+      # Run the linter.
+      - id: ruff-check
+        args: [--fix]
+      # Run the formatter.
+      - id: ruff-format

(The added `exclude` block originally listed `assets/adaptivecard.json` twice; the duplicate alternation branch is dropped here.)
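The `check-json` hook's `exclude` above is a Python verbose-mode regex (pre-commit matches `exclude` patterns against file paths with Python's `re` module). A quick self-contained sketch of how that anchored pattern behaves:

```python
import re

# Same shape as the exclude pattern above: verbose mode ((?x)) ignores
# the newlines and indentation inside the pattern, so the alternation
# can be written one path per line.
pattern = re.compile(
    r"""(?x)^(
        assets/adaptivecard.json|
        assets/slackreport.json
    )$"""
)

print(bool(pattern.match("assets/adaptivecard.json")))  # True: excluded
print(bool(pattern.match("assets/other.json")))         # False: still checked
```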

.readthedocs.yaml

Lines changed: 1 addition & 3 deletions
@@ -22,6 +22,4 @@ conda:

 # Build documentation in the "docs/" directory with Sphinx
 sphinx:
-  configuration: docs/conf.py
-
-
+  configuration: docs/conf.py

CHANGELOG.md

Lines changed: 188 additions & 187 deletions
Large diffs are not rendered by default.

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -671,4 +671,4 @@ into proprietary programs. If your program is a subroutine library, you
 may consider it more useful to permit linking proprietary applications with
 the library. If this is what you want to do, use the GNU Lesser General
 Public License instead of this License. But first, please read
-<https://www.gnu.org/licenses/why-not-lgpl.html>.
+<https://www.gnu.org/licenses/why-not-lgpl.html>.

(The change is whitespace-only, likely the trailing newline added by the `end-of-file-fixer` hook.)

README.md

Lines changed: 24 additions & 14 deletions
@@ -8,13 +8,15 @@

 DRAM v2 (Distilled and Refined Annotation of Metabolism Version 2) is a tool for annotating metagenomic and genomic assembled data (e.g. scaffolds or contigs) or called genes (e.g. nucleotide or amino acid format). DRAM annotates MAGs using [KEGG](https://www.kegg.jp/) (if provided by the user), [UniRef90](https://www.uniprot.org/), [PFAM](https://pfam.xfam.org/), [dbCAN](http://bcb.unl.edu/dbCAN2/), [RefSeq viral](https://www.ncbi.nlm.nih.gov/genome/viruses/), [VOGDB](http://vogdb.org/) and the [MEROPS](https://www.ebi.ac.uk/merops/) peptidase database as well as custom user databases.

-DRAM is run in four stages:
-1) Gene Calling (Prodigal) - genes are called on user-provided scaffolds or contigs
-2) Gene Annotation - genes are annotated with a set of user-defined databases
-3) Distillation - annotations are curated into functional categories
-4) Product Generation - interactive visualizations of DRAM output are generated
+DRAM is run in four stages:
+
+1. Gene Calling (Prodigal) - genes are called on user-provided scaffolds or contigs
+2. Gene Annotation - genes are annotated with a set of user-defined databases
+3. Distillation - annotations are curated into functional categories
+4. Product Generation - interactive visualizations of DRAM output are generated

 For more detail on DRAM and how DRAM v2 works please see our DRAM products:
+
 - [DRAM version 1 publication](https://academic.oup.com/nar/article/48/16/8883/5884738)
 - [DRAM in KBase publication](https://pubmed.ncbi.nlm.nih.gov/36857575/)
 - [DRAM webinar](https://www.youtube.com/watch?v=-Ky2fz2vw2s)
@@ -24,44 +26,51 @@ For more detail on DRAM and how DRAM v2 works please see our DRAM products:
 - [Docs](https://dramit.readthedocs.io/en/latest)
 - [Installation Guide](https://dramit.readthedocs.io/en/latest/installation.html)
 - [Usage Examples](https://dramit.readthedocs.io/en/latest/usage.html)
-- [Parameter API]([#command-line-options](https://dramit.readthedocs.io/en/latest/params_doc.html))
-- [Rules API]([#nextflow-tips-and-tricks](https://dramit.readthedocs.io/en/latest/rules_parser.html))
+- [Parameter API](https://dramit.readthedocs.io/en/latest/params_doc.html)
+- [Rules API](https://dramit.readthedocs.io/en/latest/rules_parser.html)

 ## Example Usage

 DRAM apps Call, Annotate and Distill can all be run at once or, alternatively, each app can be run individually. Here are some common usage examples:

-1) **Rename fasta headers based on input sample file names:**
+1. **Rename fasta headers based on input sample file names:**
+
 ```bash
 nextflow run WrightonLabCSU/DRAM --rename --input_fasta <path/to/fasta/directory/>
 ```

-2) **Call genes using input fastas (use --rename to rename FASTA headers):**
+2. **Call genes using input fastas (use --rename to rename FASTA headers):**
+
 ```bash
 nextflow run WrightonLabCSU/DRAM --call --rename --input_fasta <path/to/fasta/directory/>
 ```

-3) **Annotate called genes using input called genes and the KOFAM database:**
+3. **Annotate called genes using input called genes and the KOFAM database:**
+
 ```bash
 nextflow run WrightonLabCSU/DRAM --annotate --input_genes <path/to/called/genes/directory> --use_kofam
 ```

-4) **Annotate called genes using input fasta files and the KOFAM database:**
+4. **Annotate called genes using input fasta files and the KOFAM database:**
+
 ```bash
 nextflow run WrightonLabCSU/DRAM --annotate --input_fasta <path/to/called/genes/directory> --use_kofam
 ```

-5) **Merge various existing annotation files together (must be generated using DRAM):**
+5. **Merge various existing annotation files together (must be generated using DRAM):**
+
 ```bash
 nextflow run WrightonLabCSU/DRAM --merge_annotations <path/to/directory/with/multiple/annotation/TSV/files>
 ```

-6) **Distill using input annotations:**
+6. **Distill using input annotations:**
+
 ```bash
 nextflow run WrightonLabCSU/DRAM --distill_<topic|ecosystem|custom> --annotations <path/to/annotations.tsv>
 ```

-7) **Complete workflow example:**
+7. **Complete workflow example:**
+
 ```bash
 nextflow run -bg WrightonLabCSU/DRAM \
   --input_fasta [DIRECTORY of fasta files] \
@@ -98,6 +107,7 @@ params {
 ```

 You can also use a custom config file:
+
 ```bash
 nextflow run DRAM -c /path/to/custom_config.config
 ```
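A custom config passed with `-c` can set the same parameters as the `--flag` options (Nextflow exposes each `--name` flag as `params.name`). A minimal hypothetical sketch; the values are placeholders, not defaults from this repo:

```groovy
// custom_config.config (hypothetical values; params.* mirror the --flags above)
params {
    input_fasta = "/path/to/fasta/directory/"
    rename      = true
    use_kofam   = true
}
```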

assets/amg_database.20220928.tsv

Lines changed: 1 addition & 1 deletion
@@ -277,4 +277,4 @@ K01599 EC:4.1.1.37 PF01208 Uroporphyrinogen decarboxylase (URO-D) Roux et al.
 PF13385  Concanavalin A-like lectin; extracellular arabinase  Emerson et al. 2018  FALSE
 K01779  EC:5.1.1.13  racD; aspartate Racmase  Trubl et al. 2018  FALSE
 PF01786  Plastoquinol terminal oxidase (PTOX)  Sullivan et al. 2010; Ignacio-Espinoza et al. 2012; Roux et al. 2016  FALSE
-PF01077; PF03460  rdsrA; reverse-acting dissimilatory sulfite reductase (alpha subunit)  Anantharaman et al. 2014  FALSE
+PF01077; PF03460  rdsrA; reverse-acting dissimilatory sulfite reductase (alpha subunit)  Anantharaman et al. 2014  FALSE

assets/internal/generate_sql_database.py

Lines changed: 39 additions & 25 deletions
@@ -2,15 +2,17 @@
 import sqlite3
 import argparse

+
 def insert_data(conn, table_name, data):
-    placeholders = ', '.join(['?'] * len(data[0]))
+    placeholders = ", ".join(["?"] * len(data[0]))
     query = f"INSERT OR REPLACE INTO {table_name} VALUES ({placeholders})"
     conn.executemany(query, data)
     conn.commit()

+
 def process_dbcan(db_dir):
-    description_file = os.path.join(db_dir, 'dbcan.fam-activities.tsv')
-    ec_file = os.path.join(db_dir, 'dbcan.fam.subfam.ec.tsv')
+    description_file = os.path.join(db_dir, "dbcan.fam-activities.tsv")
+    ec_file = os.path.join(db_dir, "dbcan.fam.subfam.ec.tsv")

     descriptions = {}
     ecs = {}
@@ -19,68 +21,80 @@ def process_dbcan(db_dir):
     # Process descriptions
     with open(description_file) as f:
         for line in f:
-            if line.startswith('#') or not line.strip():
+            if line.startswith("#") or not line.strip():
                 continue
-            parts = line.strip().split('\t')
+            parts = line.strip().split("\t")
             if len(parts) >= 2:
-                descriptions[parts[0]] = ' '.join(parts[1:])
+                descriptions[parts[0]] = " ".join(parts[1:])
             elif len(parts) == 1:
                 descriptions[parts[0]] = "No description available"
             else:
-                skipped_lines.append(f"Skipped line in description file: {line.strip()} (expected at least 2 columns, found {len(parts)})")
+                skipped_lines.append(
+                    f"Skipped line in description file: {line.strip()} (expected at least 2 columns, found {len(parts)})"
+                )

     # Process EC numbers
     with open(ec_file) as f:
         for line in f:
-            parts = line.strip().split('\t')
+            parts = line.strip().split("\t")
             if len(parts) > 2:
                 ecs[parts[0]] = ecs.get(parts[0], set())
                 ecs[parts[0]].add(parts[2])

     data = []
     for entry in descriptions:
-        ec = ','.join(ecs.get(entry, []))
+        ec = ",".join(ecs.get(entry, []))
         data.append((entry, descriptions[entry], ec))
-
+
     return data, skipped_lines

+
 def main():
-    parser = argparse.ArgumentParser(description="Generate descriptions database for DRAM.")
-    parser.add_argument('--db_dir', required=True, help="Directory containing the database subdirectories.")
-    parser.add_argument('--output_db', required=True, help="Path to the output SQLite database.")
-    parser.add_argument('--log', required=True, help="Path to the log file.")
+    parser = argparse.ArgumentParser(
+        description="Generate descriptions database for DRAM."
+    )
+    parser.add_argument(
+        "--db_dir",
+        required=True,
+        help="Directory containing the database subdirectories.",
+    )
+    parser.add_argument(
+        "--output_db", required=True, help="Path to the output SQLite database."
+    )
+    parser.add_argument("--log", required=True, help="Path to the log file.")

     args = parser.parse_args()
-
+
     log_entries = []
     db_dir = args.db_dir
     output_db = args.output_db

     conn = sqlite3.connect(output_db)
     log_entries.append(f"Opened database {output_db}")

-    dbcan_dir = os.path.join(db_dir, 'dbcan')
+    dbcan_dir = os.path.join(db_dir, "dbcan")
     if os.path.exists(dbcan_dir):
         conn.execute("""
             CREATE TABLE IF NOT EXISTS dbcan_description (
-            id VARCHAR(30) NOT NULL,
-            description VARCHAR(1000),
-            ec VARCHAR(1000),
+                id VARCHAR(30) NOT NULL,
+                description VARCHAR(1000),
+                ec VARCHAR(1000),
                 PRIMARY KEY (id)
             );
         """)
         log_entries.append("Processing dbcan_description from " + dbcan_dir)
         data, skipped_lines = process_dbcan(dbcan_dir)
-        insert_data(conn, 'dbcan_description', data)
+        insert_data(conn, "dbcan_description", data)
         log_entries.append(f"Inserted {len(data)} records into dbcan_description")
         log_entries.extend(skipped_lines)
-
-    with open(args.log, 'w') as log_file:
+
+    with open(args.log, "w") as log_file:
         for entry in log_entries:
-            log_file.write(entry + '\n')
-
+            log_file.write(entry + "\n")
+
     conn.close()
     log_entries.append("Closed database connection")

+
 if __name__ == "__main__":
-    main()
+    main()
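The `insert_data` helper above builds a `?` placeholder list sized to the row width and relies on `INSERT OR REPLACE` so re-runs replace rows instead of duplicating them. A self-contained sketch of the same pattern against an in-memory database (the sample rows are illustrative, using the `dbcan_description` schema from the script):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS dbcan_description (
        id VARCHAR(30) NOT NULL,
        description VARCHAR(1000),
        ec VARCHAR(1000),
        PRIMARY KEY (id)
    );
    """
)

def insert_data(conn, table_name, data):
    # One "?" per column, so the same helper works for any table width.
    placeholders = ", ".join(["?"] * len(data[0]))
    query = f"INSERT OR REPLACE INTO {table_name} VALUES ({placeholders})"
    conn.executemany(query, data)
    conn.commit()

rows = [
    ("GH5", "glycoside hydrolase family 5 (illustrative)", "3.2.1.4"),
    ("GT2", "glycosyltransferase family 2 (illustrative)", ""),
]
insert_data(conn, "dbcan_description", rows)
# Re-inserting the same ids replaces rather than duplicates.
insert_data(conn, "dbcan_description", rows)
print(conn.execute("SELECT COUNT(*) FROM dbcan_description").fetchone()[0])  # 2
```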
