Hello,
I noticed something odd while working with some new de novo assemblies. For context: we made the assembly using shovill and, because our new strain turned out to be very close to an existing reference genome, we used RagTag to scaffold the assembly against that reference. (We then annotated it with prokka.)
RagTag introduces N's at assembly gaps.
We then used breseq to map the original Illumina reads back to the RagTag-scaffolded assembly. Wherever there are N's (assembly gaps), breseq calls a deletion, but:
- the MC (missing coverage) plots show reads spanning the region; they just don't match the N's in the reference genome
- the JC (new junction) evidence suggests there's a new junction on either side of the N's
I'm not sure this logic makes sense. Since there are reads that span the assembly gap, I think those reads could possibly be used to correct the gap, so I don't think it's really a "deletion"?
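To check that each reported deletion really lines up with a scaffolding gap, the runs of N's in the assembly can be listed and compared against the coordinates breseq reports. Below is a minimal Python sketch of that; `parse_fasta` and `find_n_runs` are hypothetical helper names of my own, not part of breseq or RagTag, and it assumes a plain (uncompressed) FASTA.

```python
import re

def parse_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the contig name.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def find_n_runs(text, min_len=1):
    """Return (contig, start, end) for each run of N's (1-based, inclusive)."""
    runs = []
    for name, seq in parse_fasta(text):
        for match in re.finditer(r"[Nn]+", seq):
            if match.end() - match.start() >= min_len:
                runs.append((name, match.start() + 1, match.end()))
    return runs
```

Calling `find_n_runs(open(path).read())` on the scaffolded FASTA (the path is whatever your RagTag output file is) gives a list of gap coordinates to compare with the start and end of each deletion call.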
I've attached the HTML files for the JC/MC evidence. I can share more if needed, though I suspect you could reproduce this just by replacing an arbitrary stretch of a genome with N's.
Ns-called-as-deletion.zip
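In case it helps with reproducing this, here is a rough Python sketch of the "replace a stretch of genome with N's" idea. `mask_region` and `to_fasta` are hypothetical helpers I wrote for illustration; the coordinates are 1-based and inclusive, and I have not confirmed that a reference masked this way triggers exactly the same breseq calls as a RagTag gap.

```python
def mask_region(seq, start, end):
    """Replace positions start..end (1-based, inclusive) of seq with N's."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("start/end must satisfy 1 <= start <= end <= len(seq)")
    return seq[:start - 1] + "N" * (end - start + 1) + seq[end:]

def to_fasta(name, seq, width=60):
    """Format a single sequence as FASTA text, wrapped at `width` columns."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"
```

The masked sequence written out with `to_fasta` could then be used as the breseq reference, with reads from the unmasked genome mapped back to it.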
Links to shovill, RagTag, and prokka, for info:
https://github.com/tseemann/shovill
https://github.com/malonge/RagTag
https://github.com/tseemann/prokka