Add retention time margins & directLFQ quant #13
Merged
Pull Request Overview
This PR implements retention time margins for fragments based on intensity thresholds to improve feature calculation and quantification accuracy. The system calculates RT margins by identifying where fragment intensity falls below a threshold percentage of the apex intensity. It also adds label-free quantification (LFQ) functionality using directLFQ.
- Retention time margin calculation system using intensity-based thresholds
- New plotting functions for XIC visualization with margins and RT margin histograms
- Fragment-level quantification using integrated intensity with optional RT margin filtering
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| utilities/plotting.py | Adds plotting functions for XIC with RT margins and margin histograms |
| run.py | Integrates RT margin calculation and adds fragment names to fragment df |
| quantification/test_ipynbs/test_lfq.ipynb | Test notebook for LFQ quantification workflow |
| quantification/lfq.py | New LFQ module with fragment quantification functions |
| prediction_wrappers/wrapper_ms2pip.py | Removes fragment name generation (moved to run.py) |
| mumdia.py | Core RT margin calculation logic and integration with quantification |
Comments suppressed due to low confidence (1)
mumdia.py:1
- This commented-out debug logging code should be removed. If debug logging is needed, it should be implemented properly with appropriate log levels rather than leaving commented code.
#!/usr/bin/env python
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
RobbinBouwmeester
approved these changes
Oct 1, 2025
Core concept: Fragment signal far from the apex in retention time is likely to be interference from other peptidoforms, so we want to exclude it from feature calculation and quantification:
First, retention time margins are calculated for the top 100 peptidoforms (ranked by Sage q-value after the second search) that have at least 6 PSMs. On each side of the apex RT, the margin is set where the intensity of the apex fragment falls below intensity_threshold (default: 5%) of its apex intensity. The 5th and 95th percentiles of these RT margins are then used as the minimum and maximum margins for the actual margin computation in the second step.
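As a rough illustration of the threshold rule in this first step, the sketch below walks outward from the apex of a single fragment trace until the intensity drops below the cutoff. It is not the code in mumdia.py: the function name `rt_margins_from_xic`, its signature, and the choice to express margins as distances from the apex RT are assumptions made for the example, and the trace is assumed to be sorted by RT.

```python
import numpy as np


def rt_margins_from_xic(rt, intensity, intensity_threshold=0.05):
    """Return (lower_margin, higher_margin) as RT distances from the apex.

    Hypothetical helper, not the PR's implementation. Assumes `rt` is sorted
    ascending and `intensity` holds the apex fragment's trace.
    """
    rt = np.asarray(rt, dtype=float)
    intensity = np.asarray(intensity, dtype=float)

    # A single point gives no trace to walk along, so no margin can be set.
    if rt.size < 2:
        return (np.nan, np.nan)

    apex = int(np.argmax(intensity))
    cutoff = intensity_threshold * intensity[apex]

    # Walk left until the intensity falls below the cutoff (or the trace ends).
    left = apex
    while left > 0 and intensity[left] >= cutoff:
        left -= 1

    # Walk right in the same way.
    right = apex
    while right < rt.size - 1 and intensity[right] >= cutoff:
        right += 1

    return (rt[apex] - rt[left], rt[right] - rt[apex])


rt = [10.0, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6]
intensity = [0, 10, 60, 100, 50, 3, 0]
lower, higher = rt_margins_from_xic(rt, intensity)
print(round(lower, 3), round(higher, 3))
```

If the trace never drops below the cutoff on one side, this sketch simply returns the distance to the edge of the trace; the real implementation may handle that case differently.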
Second, the RT margins are calculated for all peptidoforms in the same way as above, but applying the min and max bounds from the first step.
Finally, the RT margins are added to df_psm as two separate columns (rt_lower_margin, rt_higher_margin). If only one PSM is found for a peptidoform, these values are NaN.
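The two-step bounding can be sketched with NumPy and pandas as follows. The helper names `margin_bounds` and `clip_margins` are hypothetical, and applying the same pair of bounds to both margin columns is an assumption; only the column names rt_lower_margin and rt_higher_margin and the 5th/95th percentiles come from the description above.

```python
import numpy as np
import pandas as pd


def margin_bounds(reference_margins, lower_pct=5, upper_pct=95):
    """Min/max margin bounds from the reference peptidoforms (hypothetical helper)."""
    values = np.asarray(reference_margins, dtype=float)
    low, high = np.nanpercentile(values, [lower_pct, upper_pct])
    return (float(low), float(high))


def clip_margins(df_psm, bounds):
    """Clip both margin columns to the bounds; NaN margins stay NaN."""
    low, high = bounds
    out = df_psm.copy()
    for col in ("rt_lower_margin", "rt_higher_margin"):
        # Assumption: the same bounds apply to the left and right margins.
        out[col] = out[col].clip(lower=low, upper=high)
    return out


# Toy reference margins standing in for the top-100 peptidoforms.
bounds = margin_bounds(np.arange(0, 101))

df_psm = pd.DataFrame(
    {
        "rt_lower_margin": [1.0, 50.0, np.nan],   # NaN: only one PSM found
        "rt_higher_margin": [200.0, 60.0, np.nan],
    }
)
print(clip_margins(df_psm, bounds))
```

Clipping leaves the NaN rows untouched, which matches the behaviour described for peptidoforms with a single PSM.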
The intensity threshold is a parameter that should be read from the config. I have not included that yet because the config is currently being reworked anyway.
Edit: Also added LFQ quantification with directLFQ.
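To make the fragment-level quantification concrete, here is a minimal sketch of integrating one fragment trace with an optional RT margin filter. It does not call directLFQ and is not the code in quantification/lfq.py: the function `integrate_fragment`, its parameters, the trapezoidal rule, and the fallback to the unfiltered trace when margins are NaN are all assumptions for illustration.

```python
import numpy as np


def integrate_fragment(rt, intensity, apex_rt=None,
                       rt_lower_margin=None, rt_higher_margin=None):
    """Integrated intensity of one fragment trace (hypothetical helper).

    If an apex RT and both margins are given and not NaN, only points inside
    [apex_rt - rt_lower_margin, apex_rt + rt_higher_margin] are integrated.
    """
    rt = np.asarray(rt, dtype=float)
    intensity = np.asarray(intensity, dtype=float)

    margins = (apex_rt, rt_lower_margin, rt_higher_margin)
    if all(m is not None and not np.isnan(m) for m in margins):
        keep = (rt >= apex_rt - rt_lower_margin) & (rt <= apex_rt + rt_higher_margin)
        rt, intensity = rt[keep], intensity[keep]

    # Fewer than two points span no area.
    if rt.size < 2:
        return 0.0

    # Trapezoidal rule written out explicitly.
    return float(np.sum(np.diff(rt) * (intensity[:-1] + intensity[1:]) / 2.0))


rt = [0, 1, 2, 3, 4]
intensity = [0, 2, 4, 2, 0]
print(integrate_fragment(rt, intensity))              # whole trace
print(integrate_fragment(rt, intensity, 2, 1, 1))     # within the RT margins
```

The resulting per-fragment intensities would then be the input to the LFQ step; how they are passed to directLFQ is not shown here.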