Choosing labels based on their names for classification models#705
Choosing labels based on their names for classification models#705jcchr wants to merge 5 commits intoreleases/v2.13.xfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes a label ordering mismatch between geti-sdk and ModelAPI. ModelAPI returns inference results with labels sorted alphabetically, while geti-sdk uses the model's original label order. This inconsistency caused incorrect inference results when the orders differed.
Key changes:
- Modified label processing in
ResultsToPredictionConverterto sort labels alphabetically before mapping inference results
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
leoll2
left a comment
There was a problem hiding this comment.
Model API - used by geti-sdk to perform inference, returns results based on a list of labels which is sorted alphabetically
If confirmed, this is a rather strange behavior. Can you provide a reference to the ModelAPI code that sorts the labels alphabetically?
| ["foo bar", "foo_bar"], | ||
| ["foo_bar", "foo_bar"], | ||
| {"label_ids": ["1", "2"], "labels": ["foo_bar", "foo_bar"]}, | ||
| ["foo bar1", "foo_bar2"], | ||
| ["foo_bar1", "foo_bar2"], | ||
| {"label_ids": ["1", "2"], "labels": ["foo_bar1", "foo_bar2"]}, |
There was a problem hiding this comment.
The purpose of this test case was to check an edge case where the label names are very similar and conflict after escaping (both foo bar and foo_bar become foo_bar). The new label names foo bar1 and foo_bar2 do not trigger the intended edge case.
| for label in inference_results.top_labels: | ||
| label_idx, label_name, label_prob = label | ||
| scored_label = ScoredLabel.from_label(label=self.get_label_by_idx(label_idx), probability=label_prob) | ||
| scored_label = ScoredLabel.from_label(label=self.get_label_by_str(label_name), probability=label_prob) |
There was a problem hiding this comment.
What's wrong with get_label_by_idx?
There was a problem hiding this comment.
I remember that MAPI does not handle hierarchical classification in a consistent way, so we could not rely on the label indices. See related issue: open-edge-platform/geti#402.
Therefore, it is correct to use the label name for classification tasks. However, this means that we might get ambigous results if label are poorly named. For example, foo bar and foo_bar, spaces are not supported by either MAPI or OV AFAIK, resulting in a name collision.
| for label in inference_results.top_labels: | ||
| label_idx, label_name, label_prob = label | ||
| scored_label = ScoredLabel.from_label(label=self.get_label_by_idx(label_idx), probability=label_prob) | ||
| scored_label = ScoredLabel.from_label(label=self.get_label_by_str(label_name), probability=label_prob) |
There was a problem hiding this comment.
I remember that MAPI does not handle hierarchical classification in a consistent way, so we could not rely on the label indices. See related issue: open-edge-platform/geti#402.
Therefore, it is correct to use the label name for classification tasks. However, this means that we might get ambigous results if label are poorly named. For example, foo bar and foo_bar, spaces are not supported by either MAPI or OV AFAIK, resulting in a name collision.
Summary
Model API - used by geti-sdk to perform inference, returns results based on a list of labels which is sorted alphabetically - whereas geti-sdk works on a list ordered like it is in a model itself. In some cases order of labels might be different in both libraries - which leads to problems like the one described here : open-edge-platform/geti#1578
Correction is to establish labels returned by model based on their names instead of using their location on a list. Change should be done for classification models only - as a change related to sorting labels was introduced only for this kind of models.
How to test
Execute inference on a model - checking first if labels in a model (check the .xml file with models - list of labels is located at the end of it) are not sorted alphabetically. Check if results of inference are ok.
Checklist
License
Feel free to contact the maintainers if that's a concern.