Floor genome-wide scatter y-axis so deep deletions don't distort it (gh#385) #1079
Merged
Codecov Report

Coverage diff:

| Metric | master | #1079 | +/- |
|---|---|---|---|
| Coverage | 67.81% | 67.83% | +0.02% |
| Files | 74 | 74 | |
| Lines | 7686 | 7691 | +5 |
| Branches | 1366 | 1368 | +2 |
| Hits | 5212 | 5217 | +5 |
| Misses | 2034 | 2034 | |
| Partials | 440 | 440 | |
…gh#385)

The region/chromosome scatter view already clamps its auto-scaled y-axis lower limit at -5.0 and flags segments below it (commits ca8d5cc / 0920829). The genome-wide view (cnv_on_genome) never got the same treatment: it used np.nanmin([seg.min() - 0.2, -1.5]) with no floor, so a single homozygous-deletion segment (log2 ~ -12) pulled y_min down to ~-12.2 and compressed all real signal into the top sliver of the plot.

Add a shared AUTO_Y_MIN_FLOOR (-5.0) constant and apply it in cnv_on_genome, keeping the existing -1.5 default-extend so quiet genomes are unaffected. Warn (as the region view does) when segments are clipped below the floor, pointing users to --y-min. Switch the region view's literal -5.0 to the shared constant.

Plotting-only change: genome-wide scatter plots with very deep deletions now render with a -5.0 lower bound instead of an arbitrarily low one; no change to .cnr/.cns/.cnn/SEG/VCF output. An explicit --y-min still overrides the floor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
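The arithmetic in the commit message can be illustrated with a small sketch. This is not the cnvlib code itself: `auto_y_min_old` and `auto_y_min_new` are hypothetical helper names, and the sample arrays are made up. Only the `AUTO_Y_MIN_FLOOR` value and the `np.nanmin([...])` expression are taken from the message above.

```python
import numpy as np

AUTO_Y_MIN_FLOOR = -5.0  # value stated in the commit message


def auto_y_min_old(seg_log2):
    """Pre-fix rule: extend to the lowest segment (minus a margin), no floor."""
    return float(np.nanmin([seg_log2.min() - 0.2, -1.5]))


def auto_y_min_new(seg_log2):
    """Post-fix rule: same extension, but never below the shared floor."""
    return max(AUTO_Y_MIN_FLOOR, auto_y_min_old(seg_log2))


# Illustrative segment means (invented data, not from the PR)
deep = np.array([0.1, -0.4, -12.0])   # one homozygous-deletion segment
quiet = np.array([0.1, -0.4, 0.3])    # no deep deletion

print(auto_y_min_old(deep), auto_y_min_new(deep))
print(auto_y_min_old(quiet), auto_y_min_new(quiet))
```

Under these assumptions the deep-deletion array hits the floor, while the quiet array keeps the -1.5 default in both versions, which is the "quiet genomes are unaffected" property the message describes.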
Force-pushed from a7d6aad to a2742cf.
Summary
Plotting homozygous deletions (very large negative log2) distorted the y-axis of `cnvkit.py scatter`. The region/chromosome view was fixed long ago: it clamps its auto-scaled lower limit at `-5.0` and flags clipped segments (commits ca8d5cc / 0920829). The genome-wide view (`cnv_on_genome`) never got the same treatment.

Root cause

`cnv_on_genome` computed `y_min = np.nanmin([seg_auto_vals.min() - 0.2, -1.5])` with no lower bound. A single homozygous-deletion segment (log2 ≈ -12) drove `y_min` to ≈ -12.2, compressing all real signal into the top ~10% of the plot.

Empirically, before the fix:

| View | Auto-scaled y-limits |
|---|---|
| region/chromosome (`cnv_on_chromosome`) | `(-5.0, 0.3)` ✓ |
| genome-wide (`cnv_on_genome`) | `(-12.2, 1.5)` ✗ |

Fix

- Add a shared `AUTO_Y_MIN_FLOOR = -5.0` constant.
- Apply it in `cnv_on_genome`: `max(AUTO_Y_MIN_FLOOR, np.nanmin([seg.min() - 0.2, -1.5]))`. This keeps the existing `-1.5` default-extend (quiet genomes are unaffected), extends to fit moderate deletions, and floors pathological ones at `-5.0`.
- Warn (as the region view does) when segments are clipped below the floor, pointing users to `--y-min`.
- Switch the region view's literal `-5.0` to the shared constant (no behavior change).

After the fix: genome `(-5.0, 1.5)`; quiet genome (no deep deletion) still `(-1.5, 1.5)`; explicit `--y-min -15` still honored.

Tests

- `PlotTests::test_scatter_genome_y_floor` (`test/test_commands.py`): asserts the genome-wide y-axis is floored at `AUTO_Y_MIN_FLOOR` for a deep-deletion dataset, and that an explicit `--y-min` overrides it. Written failing first (`-12.2 not >= -5.0`), passes with the fix.
- `test_commands.py` (74) and `test_cnvlib.py` (30) pass; `mypy` and `ruff` clean.

Clinical-impact note

Plotting-only. Genome-wide scatter plots containing very deep deletions now render with a `-5.0` lower bound instead of an arbitrarily low one (this is the intended fix). No change to `.cnr`/`.cns`/`.cnn`/SEG/VCF output, so downstream pipelines are unaffected. The `--y-min` escape hatch is preserved.

Closes #385.
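The three behaviours this description claims (floor for deep deletions, unchanged default for quiet genomes, explicit `--y-min` still winning) can be sketched in one place. This is a minimal illustration under stated assumptions, not the code merged in this PR: `resolve_y_min`, its signature, and the warning text are hypothetical, and only `AUTO_Y_MIN_FLOOR` and the `-0.2` / `-1.5` constants come from the description.

```python
import logging

import numpy as np

AUTO_Y_MIN_FLOOR = -5.0  # shared floor named in this PR


def resolve_y_min(seg_log2, user_y_min=None):
    """Return (y_min, n_clipped) for a genome-wide scatter (hypothetical helper)."""
    seg_log2 = np.asarray(seg_log2, dtype=float)
    if user_y_min is not None:
        # An explicit --y-min takes precedence over the automatic floor.
        y_min = float(user_y_min)
    else:
        # Extend to the lowest segment (with a margin), at least to -1.5,
        # but never below the floor.
        auto = float(np.nanmin([np.nanmin(seg_log2) - 0.2, -1.5]))
        y_min = max(AUTO_Y_MIN_FLOOR, auto)
    # Segments strictly below the chosen limit would be drawn off-axis.
    n_clipped = int(np.sum(seg_log2 < y_min))
    if n_clipped and user_y_min is None:
        logging.warning(
            "%d segment(s) fall below the y-axis floor %.1f; "
            "use --y-min to show them",
            n_clipped,
            y_min,
        )
    return y_min, n_clipped
```

For example, under these assumptions `resolve_y_min([0.1, -12.0])` gives a limit of -5.0 with one clipped segment and logs a warning, while passing `user_y_min=-15` for the same data returns -15.0 with nothing clipped.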
🤖 Generated with Claude Code