35 changes: 24 additions & 11 deletions README.md
Original file line number Diff line number Diff line change
@@ -62,7 +62,7 @@ localization quality assurance testers only.
You can see all these concepts in action
in [our HTTP server example implementation](./examples/locales-http-examples).

#### Calculate the affinity between two locales
#### Calculate the affinity between locales

This feature lets you reason programmatically about the affinity between locales,
without having to know how they relate to each other.
@@ -77,18 +77,31 @@ We define the affinity between two locales using a `LocaleAffinity` enum value:
should understand both if they understand one of them.
- `SAME`: Locales identify the same language

We offer two separate logics, each dedicated to separate use-cases:
We offer separate affinity logics, each dedicated to separate use-cases:

- **Locale affinity calculation**: To be used when we need visibility on the affinity of a given
locale against a set of locales.
- **Reference locales calculation:** To be used when we need to join two datasets based on language
identifiers. It is indeed impossible to perform such a join operation out of the box, as language
identifiers can immensely differ even when they are syntactically valid and identify the very same
language. For Example: `zh-Hant`, `zh-HK`, `zh-MO`, `zh-Hant-TW`, `zh-Hant-FR`, `zh-US` all
identify Traditional Chinese, but `zh` and `zh-CN` identify Simplified Chinese.
##### Calculate the affinity of a given locale against a set of locales

You can see all these concepts in action
in [our locales affinity example implementations](./examples/locales-affinity-examples).
Use this when you need to know the affinity of a given locale against a set of pre-configured
locales. For instance, it can be used to verify whether some content language is a good match for a
given user, based on the `Accept-Language` header value received in an incoming request.

You can see this concept in action
in [our example implementation](./examples/locales-affinity-examples/src/main/java/com/spotify/i18n/locales/affinity/examples/AffinityCalculationExampleMain.java).
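
As a rough illustration of the `Accept-Language` use case, the sketch below checks a header value against a set of supported content locales using only the JDK. The `ToyAffinity` enum, `toyAffinity` and `bestAffinity` are hypothetical stand-ins written for this example: they compare primary language subtags only, and are neither the library's API nor its actual affinity logic.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class AcceptLanguageAffinitySketch {

  /** Hypothetical two-level scale; the library's own enum has more levels. */
  enum ToyAffinity {
    NONE,
    SAME
  }

  /**
   * Toy stand-in rule: SAME when both tags share a non-empty primary language subtag. It ignores
   * scripts and regions, so it is far cruder than a real affinity calculation.
   */
  static ToyAffinity toyAffinity(String tag1, String tag2) {
    String language1 = Locale.forLanguageTag(tag1).getLanguage();
    String language2 = Locale.forLanguageTag(tag2).getLanguage();
    return !language1.isEmpty() && language1.equals(language2)
        ? ToyAffinity.SAME
        : ToyAffinity.NONE;
  }

  /** Returns SAME as soon as any accepted language range matches any supported locale. */
  static ToyAffinity bestAffinity(String acceptLanguage, Set<String> supportedTags) {
    // Locale.LanguageRange.parse orders the ranges by descending weight (q-value).
    List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
    for (Locale.LanguageRange range : ranges) {
      for (String supportedTag : supportedTags) {
        if (toyAffinity(range.getRange(), supportedTag) == ToyAffinity.SAME) {
          return ToyAffinity.SAME;
        }
      }
    }
    return ToyAffinity.NONE;
  }

  public static void main(String[] args) {
    Set<String> supportedTags = Set.of("fr-BE", "de");
    System.out.println(bestAffinity("fr-CA,fr;q=0.8,en;q=0.5", supportedTags));
    System.out.println(bestAffinity("ja,en;q=0.5", supportedTags));
  }
}
```

Because the toy rule ignores scripts, it would wrongly treat `zh-HK` and `zh-CN` as the same language, which is precisely the kind of mistake a proper affinity calculation is meant to avoid.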

##### Calculate the affinity between two given locales

Use this when you need to know the affinity between two given locales. For instance, it can be used
to join two datasets based on language identifiers and how they relate to each other in terms of
affinity.

Such a join cannot be performed out of the box, because language identifiers can differ immensely
even when they are syntactically valid and identify the very same language. For
example: `zh-Hant`, `zh-HK`, `zh-MO`, `zh-Hant-TW`, `zh-Hant-FR`, `zh-US` all
identify Traditional Chinese, but `zh` and `zh-CN` identify Simplified Chinese.

You can see this concept in action
in [our example implementation](./examples/locales-affinity-examples/src/main/java/com/spotify/i18n/locales/affinity/examples/AffinityBasedJoinExampleMain.java).
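
To make the join problem concrete, here is a minimal JDK-only sketch. It first tries an exact match on canonicalised tags, which finds nothing, then repeats the join with a pluggable matcher. The class, the `join` helper and the hard-coded `traditionalChinese` lookup are hypothetical and exist only for illustration; in real use the matcher would presumably wrap a `LocaleAffinityBiCalculator` and accept any pair whose affinity is not `NONE`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.BiPredicate;

public class AffinityJoinSketch {

  /** Joins two lists of language tags, keeping every pair accepted by the given matcher. */
  static List<String[]> join(
      List<String> origin, List<String> target, BiPredicate<String, String> matcher) {
    List<String[]> pairs = new ArrayList<>();
    for (String originTag : origin) {
      for (String targetTag : target) {
        if (matcher.test(originTag, targetTag)) {
          pairs.add(new String[] {originTag, targetTag});
        }
      }
    }
    return pairs;
  }

  /** Naive matcher: the two tags must be identical once canonicalised by the JDK. */
  static boolean sameCanonicalTag(String tag1, String tag2) {
    return Locale.forLanguageTag(tag1)
        .toLanguageTag()
        .equals(Locale.forLanguageTag(tag2).toLanguageTag());
  }

  public static void main(String[] args) {
    List<String> origin = List.of("zh-HK", "zh-MO");
    List<String> target = List.of("zh-Hant", "zh-Hant-TW", "zh-CN");

    // Exact matching joins nothing, although four of these pairs identify the same language.
    System.out.println(join(origin, target, AffinityJoinSketch::sameCanonicalTag).size());

    // Hand-written stand-in for an affinity-based matcher (assumption: illustration only).
    Set<String> traditionalChinese = Set.of("zh-HK", "zh-MO", "zh-Hant", "zh-Hant-TW");
    BiPredicate<String, String> standInMatcher =
        (tag1, tag2) -> traditionalChinese.contains(tag1) && traditionalChinese.contains(tag2);
    System.out.println(join(origin, target, standInMatcher).size());
  }
}
```

Keeping the matcher as a `BiPredicate` parameter separates the join loop from the affinity decision, so the stand-in can be swapped for a real calculator without touching the join itself.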

### Utility helpers

@@ -0,0 +1,122 @@
/*-
* -\-\-
* locales-affinity-examples
* --
* Copyright (C) 2016 - 2025 Spotify AB
* --
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
* -/-/-
*/

package com.spotify.i18n.locales.affinity.examples;

import com.spotify.i18n.locales.common.LocaleAffinityBiCalculator;
import com.spotify.i18n.locales.common.LocaleAffinityHelpersFactory;
import com.spotify.i18n.locales.common.model.LocaleAffinityResult;
import java.util.List;

/**
 * Showcase implementation of Java-locales affinity calculation
 *
 * @author Eric Fjøsne
 */
public class AffinityBasedJoinExampleMain {

  /** Create a {@link LocaleAffinityBiCalculator} instance out of the factory */
  private static final LocaleAffinityBiCalculator LOCALE_AFFINITY_BI_CALCULATOR =
      LocaleAffinityHelpersFactory.getDefaultInstance().buildAffinityBiCalculator();

  /**
   * Example logic which attempts to join 2 sets of language tags.
   *
   * <p>Possible joins in the execution output are:
   *
   * <ul>
   *   <li>(bs-Cyrl-BA, bs-Latn) -> Join possible with SAME affinity.
   *   <li>(bs-Cyrl-BA, hr-MK) -> Join possible with MUTUALLY_INTELLIGIBLE affinity.
   *   <li>(de, de-AT) -> Join possible with SAME affinity.
   *   <li>(da-SE, nb-FI) -> Join possible with HIGH affinity.
   *   <li>(en-GB, en-JP) -> Join possible with SAME affinity.
   *   <li>(en-GB, en-SE) -> Join possible with SAME affinity.
   *   <li>(es-BE, ca) -> Join possible with LOW affinity.
   *   <li>(fr-SE, fr-BE-u-ca-gregorian) -> Join possible with SAME affinity.
   *   <li>(fr-SE, fr-CA) -> Join possible with SAME affinity.
   *   <li>(hr-BA, bs-Latn) -> Join possible with MUTUALLY_INTELLIGIBLE affinity.
   *   <li>(hr-BA, hr-MK) -> Join possible with SAME affinity.
   *   <li>(ja-IT, ja@calendar=buddhist) -> Join possible with SAME affinity.
   *   <li>(nl-BE, nl-ZA) -> Join possible with SAME affinity.
   *   <li>(zh-Hans-US, zh-CN) -> Join possible with SAME affinity.
   * </ul>
   *
   * @param args command-line arguments (unused)
   */
  public static void main(String[] args) {
    final List<String> languageTagsInOriginDataset =
        List.of(
            "bs-Cyrl-BA", // Bosnian (Cyrillic), Bosnia and Herzegovina
            "de", // German
            "da-SE", // Danish (Sweden)
            "en-GB", // English (Great Britain)
            "es-BE", // Spanish (Belgium)
            "fr-SE", // French (Sweden)
            "hr-BA", // Croatian (Bosnia and Herzegovina)
            "it-CH", // Italian (Switzerland)
            "ja-IT", // Japanese (Italy)
            "nl-BE", // Dutch (Belgium)
            "zh-Hans-US", // Chinese (Simplified) (USA)
            "zh-HK" // Chinese (Hong Kong)
            );

    final List<String> languageTagsInTargetDataset =
        List.of(
            "bs-Latn", // Bosnian (Latin)
            "ca", // Catalan
            "de-AT", // German (Austria)
            "en-JP", // English (Japan)
            "en-SE", // English (Sweden)
            "fr-BE-u-ca-gregorian", // French (Belgium), with gregorian calendar extension
            "fr-CA", // French (Canada)
            "hr-MK", // Croatian (North Macedonia)
            "ja@calendar=buddhist", // Japanese, with buddhist calendar extension
            "nb-FI", // Norwegian Bokmål (Finland)
            "nl-ZA", // Dutch (South Africa)
            "pt-US", // Portuguese (USA)
            "zh-CN" // Chinese (Mainland China)
            );

    // Iterate through all possible combinations, and calculate the affinity for each of them.
    for (String languageTagInOriginDataset : languageTagsInOriginDataset) {
      for (String languageTagInTargetDataset : languageTagsInTargetDataset) {
        LocaleAffinityResult affinityResult =
            LOCALE_AFFINITY_BI_CALCULATOR.calculate(
                languageTagInOriginDataset, languageTagInTargetDataset);
        switch (affinityResult.affinity()) {
          case NONE:
            System.out.println(
                String.format(
                    "(%s, %s) -> No join possible.",
                    languageTagInOriginDataset, languageTagInTargetDataset));
            break;
          default:
            System.out.println(
                String.format(
                    "(%s, %s) -> Join possible with %s affinity.",
                    languageTagInOriginDataset,
                    languageTagInTargetDataset,
                    affinityResult.affinity()));
            break;
        }
      }
    }
  }
}

This file was deleted.

@@ -0,0 +1,42 @@
/*-
* -\-\-
* locales-common
* --
* Copyright (C) 2016 - 2025 Spotify AB
* --
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
* -/-/-
*/

package com.spotify.i18n.locales.common;

import com.spotify.i18n.locales.common.model.LocaleAffinityResult;
import edu.umd.cs.findbugs.annotations.Nullable;

/**
 * Represents an engine that calculates the locale affinity between two given language tags. All
 * implementations of this interface must return a non-null {@link LocaleAffinityResult}, even when
 * the given language tags are null or empty.
 *
 * @author Eric Fjøsne
 */
public interface LocaleAffinityBiCalculator {

  /**
   * Returns the calculated {@link LocaleAffinityResult} for the two given language tags.
   *
   * @param languageTag1 the first language tag, possibly null or empty
   * @param languageTag2 the second language tag, possibly null or empty
   * @return the locale affinity result, never null
   */
  LocaleAffinityResult calculate(
      @Nullable final String languageTag1, @Nullable final String languageTag2);
}