The CI has been fixed in #22. Please pull from the main branch next time you push so that the CI gets fixed here as well, thanks.
mgaido91
left a comment
I have another couple of questions:
- regarding the character-level case, here we are just splitting every character on its own, while for the mwersegmenter a dedicated segmenter is used (see 93c51b4). I am not sure how the two differ, but is there a reason for using a different segmentation method?
- how do you envision the quality scoring part when this latency scorer is used? Do we score quality metrics always with the mwersegmenter, while using this segmenter for the latency? Or shall we introduce this segmenter for the quality scoring as well?
PS Can we also add a simple UT with the word and the char case as done in 93c51b4?
Thanks!
| Args: | ||
| args (argparse.Namespace): Parsed command-line arguments. | ||
| """ | ||
|
|
please avoid unrelated changes
| before computing latency, making it more robust for long-form speech translation evaluation. | ||
|
|
||
| The key difference from StreamLAAL is the use of SoftSegmenter's more sophisticated | ||
| alignment algorithm that handles long-form audio better. Additionally, LongYAAL is considers |
| alignment algorithm that handles long-form audio better. Additionally, LongYAAL is considers | |
| alignment algorithm that handles long-form audio better. Additionally, LongYAAL considers |
| """ | ||
| ... | ||
|
|
||
| def _split_delays_by_segmented_text( |
this should be removed and inherited from the parent class
| ... # Compute a custom latency score | ||
| ... return LatencyScores(...) | ||
| """ | ||
| def __init__(self, args): |
this also can be removed and inherited from parent
|
|
||
|
|
||
| @dataclass | ||
| class ResegmentedLatencyScoringSample: |
since we have a segmenter_based_scorer file we can put this there
| delay=word.delay, | ||
| seq_id=word.seq_id, | ||
| elapsed=word.elapsed, | ||
| main=main, |
rather than main this would be the first?
| for i in range(len(sample.reference)): | ||
| new_segmentation[i] = [] |
this is useless since we have the get at line 466
| ideal_delays = [w.delay - ref.start_time for w in segment_words] | ||
| ca_delays = [w.elapsed - ref.start_time for w in segment_words] |
this is where, for coherence with the mwersegmenter and other latency measures, we should get rid of the - ref.start_time. We can do this in the YAAL code
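One possible reading of this suggestion, as a rough sketch only: the shared resegmentation step keeps delays as measured, and the YAAL-specific code applies the offset. The function names below are hypothetical and do not come from the PR.

```python
from typing import List


def segment_delays(word_delays: List[float]) -> List[float]:
    """Shared step (hypothetical): return delays unchanged, with no offset applied."""
    return list(word_delays)


def yaal_relative_delays(delays: List[float], ref_start_time: float) -> List[float]:
    """YAAL-specific step (hypothetical): subtract the reference segment start time."""
    return [delay - ref_start_time for delay in delays]


# The offset is applied only where the YAAL metric needs it.
absolute = segment_delays([12.5, 14.0])
relative = yaal_relative_delays(absolute, ref_start_time=10.0)
```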
| return True | ||
|
|
||
| @abstractmethod | ||
| def _do_score(self, samples: List[ResegmentedLatencyScoringSample]) -> LatencyScores: |
let's move also this to the parent class
| index += segment_len | ||
| assert len(delays) == index, \ | ||
| f"Index {index} should have reached end of delays ({len(delays)})" | ||
| return segmented_delays |
| return segmented_delays | |
| return segmented_delays | |
| def _resegment_samples(self, samples: List[LatencyScoringSample]) -> List[ResegmentedLatencyScoringSample]: | |
| ... | |
| def score(self, samples: List[LatencyScoringSample]) -> LatencyScores: | |
| resegmented_samples = self._resegment_samples(samples) | |
| return self._do_score(resegmented_samples) | |
and we can add a comment to the main class that subclasses should implement _resegment_samples, as is done for _do_score. In this way we can isolate the resegmentation part in the subclasses. Thanks.
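To make the proposed split concrete, here is a minimal, self-contained sketch of the pattern. The dataclasses are simplified stand-ins for the real types, and the names `SegmenterBasedLatencyScorer` and `ToyScorer` (with their fields) are hypothetical, used only for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class LatencyScoringSample:
    # Simplified stand-in: one delay per emitted word.
    delays: List[float]


@dataclass
class ResegmentedLatencyScoringSample:
    # Simplified stand-in: delays grouped by reference segment.
    segment_delays: List[List[float]]


@dataclass
class LatencyScores:
    # Simplified stand-in for the real scores container.
    ideal_latency: float


class SegmenterBasedLatencyScorer(ABC):
    """Parent class (hypothetical name).

    Subclasses should implement ``_resegment_samples`` and ``_do_score``.
    """

    @abstractmethod
    def _resegment_samples(
            self, samples: List[LatencyScoringSample]
    ) -> List[ResegmentedLatencyScoringSample]:
        ...

    @abstractmethod
    def _do_score(
            self, samples: List[ResegmentedLatencyScoringSample]) -> LatencyScores:
        ...

    def score(self, samples: List[LatencyScoringSample]) -> LatencyScores:
        # Resegmentation is isolated in the subclass; scoring follows.
        resegmented_samples = self._resegment_samples(samples)
        return self._do_score(resegmented_samples)


class ToyScorer(SegmenterBasedLatencyScorer):
    """Toy subclass: every delay becomes its own segment; the score is the mean."""

    def _resegment_samples(self, samples):
        return [
            ResegmentedLatencyScoringSample([[d] for d in s.delays])
            for s in samples
        ]

    def _do_score(self, samples):
        delays = [d for s in samples for seg in s.segment_delays for d in seg]
        return LatencyScores(ideal_latency=sum(delays) / len(delays))


scores = ToyScorer().score([LatencyScoringSample(delays=[1.0, 3.0])])
```

With this layout a new segmenter only needs to override `_resegment_samples`, while `score` stays in the parent class.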
| simulstream.metrics.scorers.quality.mwersegmenter | ||
| simulstream.metrics.scorers.latency | ||
| simulstream.metrics.scorers.latency.mwersegmenter | ||
| simulstream.metrics.scorers.latency.softsegmenter |
| simulstream.metrics.scorers.latency.softsegmenter | |
| simulstream.metrics.scorers.latency.softsegmenter | |
| simulstream.metrics.scorers.latency.segmenter_based_scorer |
|
I am closing this PR. I have filed a new PR #24 that implements a unified quality and latency evaluation via OmniSTEval as proposed in https://arxiv.org/abs/2509.17349.
This pull request introduces a new latency metric implementation, LongYAAL.
New metric implementation:
- Adds the LongYAAL class to simulstream.metrics.scorers.latency.long_yaal.py, implementing the Long-form Yet Another Average Lagging metric.
- Registers the scorer as "long_yaal" for use in the evaluation framework.
- Adds a SoftSegmenter that replaces mWERSegmenter in LongYAAL.