I am attempting to build a custom SingleM metapackage (smpkg) for eukaryotic species using BUSCO single-copy orthologs as marker genes. The package builds without errors, but every OTU from short-read metagenomic data is annotated as "Root", which suggests that taxonomic assignment is failing.
- Marker selection: Extracted universal single-copy orthologs from BUSCO datasets
- Sequence retrieval: Retrieved reference sequences for target eukaryotic clades
- Taxonomy integration: Built custom NCBI taxonomy tree incorporating new lineages
- Package assembly: Generated smpkg using singlem metapackage create
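
One thing I can check on my side is whether every reference sequence actually has a taxonomy entry. Below is a minimal sketch of that check. It assumes the references are in FASTA format and the taxonomy is a two-column tab-separated file (sequence ID, then lineage); both layouts and all function names are my own assumptions for illustration, not SingleM's required format.

```python
def fasta_ids(fasta_text):
    """Return the set of sequence IDs (first token of each '>' header)."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].split()
            if header:
                ids.add(header[0])
    return ids


def taxonomy_map(taxonomy_text):
    """Map sequence ID -> lineage from an assumed two-column TSV."""
    mapping = {}
    for line in taxonomy_text.splitlines():
        if not line.strip():
            continue
        seq_id, _, lineage = line.partition("\t")
        mapping[seq_id.strip()] = lineage.strip()
    return mapping


def missing_taxonomy(fasta_text, taxonomy_text):
    """List reference IDs with no lineage, or with an empty one."""
    tax = taxonomy_map(taxonomy_text)
    return sorted(i for i in fasta_ids(fasta_text) if not tax.get(i))


if __name__ == "__main__":
    # Hypothetical toy inputs; real files would be read from disk.
    fasta = ">seq1 some description\nACGT\n>seq2\nGGCC\n"
    taxonomy = "seq1\tEukaryota;Fungi\n"
    print(missing_taxonomy(fasta, taxonomy))
```

If this list is non-empty, or if the IDs differ only by a suffix or version number, that would point to a taxonomy formatting problem rather than an alignment problem.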
- Does SingleM officially support custom eukaryotic metapackages, or is eukaryotic analysis currently restricted?
- Are there known limitations with short-read alignment against BUSCO-derived markers?
- What diagnostic steps would help identify whether this is a taxonomy formatting issue vs. some other failure?
gene sample sequence num_hits coverage taxonomy
s3.108097 SRR12711264_R1 ACCGGCATCAAGGCCATTGACGGCATGATCCCCATCGGCAAGGGTCAGCGTGAGCTGATC 2 3.30 Root
s3.108097 SRR12711264_R1 ACAGGTATTAAGGCAATTGATGCCATGGTTCCAATCGGAAGAGGTCAGAGAGAGTTAATT 3 4.95 Root
s3.108097 SRR12711264_R1 ACCGGTATTAAATGTATCGACGCTCTCGTACCTATCGGACGTGGCCAACGTGAACTTATC 4 6.59 Root
s3.108097 SRR12711264_R1 ACCGGTATCAAGGTTGTTGACCTGATCTGCCCCTACGCAAAGGGCGGTAAGATCGGTCTG 3 4.95 Root
s3.108097 SRR12711264_R1 ACAGGCATAAAGGTGATTGACCTGCTGGAACCATACTGCAAAGGTGGGAAGATTGGACTC 1 1.65 Root
s3.108097 SRR12711264_R1 ACCGGCTTTAAGGCTATCGACGCGATGATTCCTATCGGTCGTGGTCAGCGTGAGTTGATT 6 9.89 Root
s3.108097 SRR12711264_R1 ACAGGCATTAAGGTAATAGATTTGCTCGAGCCCTACCTTAAAGGCGGCAAGATCGGTCTT 18 29.67 Root
s3.108097 SRR12711264_R1 ACAGGTATCAAGGCTATTGACAGTATGATTCCTATCGGCAGAGGCCAGAGAGAACTTATC 1 1.65 Root
s3.108097 SRR12711264_R1 ACCGGCATCAAGGCCATCGACTCCATGATCCCCATCGGTCGTGGCCAGCGTGAGCTGATC 2 3.30 Root
s3.108097 SRR12711264_R1 ACGGGCATCAAGGTCATCGATCTGCTCGAACCATATCTGAAAGGAGGAAAGATCGGACTT 1 1.65 Root
s3.108097 SRR12711264_R1 ACGGGCATCAAGGTCATCGACCTGATCTGCCCCTACGCCAAGGGTGGCAAGATCGGCCTG 3 4.95 Root
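
To quantify how widespread the problem is across markers, a small script along these lines could tally the "Root" assignments per gene. It is only a sketch: it assumes the whitespace-separated column layout shown above (gene, sample, sequence, num_hits, coverage, taxonomy), and the function name is hypothetical.

```python
from collections import defaultdict


def summarize_otu_table(text):
    """Count OTUs and hits per gene, and how many are assigned only to Root."""
    lines = [l for l in text.splitlines() if l.strip()]
    header = lines[0].split()
    col = {name: i for i, name in enumerate(header)}
    summary = defaultdict(
        lambda: {"otus": 0, "root_otus": 0, "hits": 0, "root_hits": 0}
    )
    for line in lines[1:]:
        # Taxonomy is the last column and may contain spaces,
        # so limit the number of splits.
        fields = line.split(None, len(header) - 1)
        gene = fields[col["gene"]]
        hits = int(fields[col["num_hits"]])
        is_root = fields[col["taxonomy"]].strip().lower() == "root"
        entry = summary[gene]
        entry["otus"] += 1
        entry["hits"] += hits
        if is_root:
            entry["root_otus"] += 1
            entry["root_hits"] += hits
    return dict(summary)


if __name__ == "__main__":
    # Two rows copied from the table above.
    table = (
        "gene sample sequence num_hits coverage taxonomy\n"
        "s3.108097 SRR12711264_R1 ACCGGCATCAAGGCCATTGACGGCATGATCCCCATCGGCAAGGGTCAGCGTGAGCTGATC 2 3.30 Root\n"
        "s3.108097 SRR12711264_R1 ACAGGTATTAAGGCAATTGATGCCATGGTTCCAATCGGAAGAGGTCAGAGAGAGTTAATT 3 4.95 Root\n"
    )
    for gene, stats in summarize_otu_table(table).items():
        print(gene, stats)
```

If every gene shows root_otus equal to otus, the failure is systematic across the package rather than specific to one marker.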
Best wishes
part_log.txt