Skip to content

Choosing labels based on their names for classification models#705

Open
jcchr wants to merge 5 commits intoreleases/v2.13.xfrom
jchrapko/sorting_labels
Open

Choosing labels based on their names for classification models#705
jcchr wants to merge 5 commits intoreleases/v2.13.xfrom
jchrapko/sorting_labels

Conversation

@jcchr
Copy link
Contributor

@jcchr jcchr commented Dec 5, 2025

Summary

Model API - used by geti-sdk to perform inference, returns results based on a list of labels which is sorted alphabetically - whereas geti-sdk works on a list ordered like it is in a model itself. In some cases order of labels might be different in both libraries - which leads to problems like the one described here : open-edge-platform/geti#1578

Correction is to establish labels returned by model based on their names instead of using their location on a list. Change should be done for classification models only - as a change related to sorting labels was introduced only for this kind of models.

How to test

Execute inference on a model - checking first if labels in a model (check the .xml file with models - list of labels is located at the end of it) are not sorted alphabetically. Check if results of inference are ok.

Checklist

  • I have tested my changes manually.​
  • I have added tests to cover my changes.​

License

  • I submit my code changes under the same Apache License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below).
# Copyright (C) 2025 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions
# and limitations under the License.

Copilot AI review requested due to automatic review settings December 5, 2025 10:09
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a label ordering mismatch between geti-sdk and ModelAPI. ModelAPI returns inference results with labels sorted alphabetically, while geti-sdk uses the model's original label order. This inconsistency caused incorrect inference results when the orders differed.

Key changes:

  • Modified label processing in ResultsToPredictionConverter to sort labels alphabetically before mapping inference results

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@jcchr jcchr changed the title Sorting labels alphabetically to preserve compatibility with modelapi Choosing labels based on their names for classification models Dec 16, 2025
Copy link
Contributor

@leoll2 leoll2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Model API - used by geti-sdk to perform inference, returns results based on a list of labels which is sorted alphabetically

If confirmed, this is a rather strange behavior. Can you provide a reference to the ModelAPI code that sorts the labels alphabetically?

Comment on lines -312 to +314
["foo bar", "foo_bar"],
["foo_bar", "foo_bar"],
{"label_ids": ["1", "2"], "labels": ["foo_bar", "foo_bar"]},
["foo bar1", "foo_bar2"],
["foo_bar1", "foo_bar2"],
{"label_ids": ["1", "2"], "labels": ["foo_bar1", "foo_bar2"]},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of this test case was to check an edge case where the label names are very similar and conflict after escaping (both foo bar and foo_bar become foo_bar). The new label names foo bar1 and foo_bar2 do not trigger the intended edge case.

for label in inference_results.top_labels:
label_idx, label_name, label_prob = label
scored_label = ScoredLabel.from_label(label=self.get_label_by_idx(label_idx), probability=label_prob)
scored_label = ScoredLabel.from_label(label=self.get_label_by_str(label_name), probability=label_prob)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with get_label_by_idx?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember that MAPI does not handle hierarchical classification in a consistent way, so we could not rely on the label indices. See related issue: open-edge-platform/geti#402.

Therefore, it is correct to use the label name for classification tasks. However, this means that we might get ambigous results if label are poorly named. For example, foo bar and foo_bar, spaces are not supported by either MAPI or OV AFAIK, resulting in a name collision.

@leoll2 leoll2 requested a review from maxxgx December 16, 2025 08:58
for label in inference_results.top_labels:
label_idx, label_name, label_prob = label
scored_label = ScoredLabel.from_label(label=self.get_label_by_idx(label_idx), probability=label_prob)
scored_label = ScoredLabel.from_label(label=self.get_label_by_str(label_name), probability=label_prob)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember that MAPI does not handle hierarchical classification in a consistent way, so we could not rely on the label indices. See related issue: open-edge-platform/geti#402.

Therefore, it is correct to use the label name for classification tasks. However, this means that we might get ambigous results if label are poorly named. For example, foo bar and foo_bar, spaces are not supported by either MAPI or OV AFAIK, resulting in a name collision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants