Quality check on full codelist #47

Simamohammad · 2025-09-10T13:36:57Z

Issues identified by Constanza
• Coding-system label change to ICD10_2019
This renaming now occurs consistently and is handled via script during normalization.
• Not all dictionary versions are displayed in CodeMapper. Until this is resolved in CodeMapper, I will use the script to add the ICD10 1998 version from SharePoint when curating the CodeMapper output, and I will resolve the inconsistencies between tags using clinical review insight. Both coding systems (ICD10 1998 and ICD10 2019) have to stay under the name ICD10, so I will add a column specifying the version. I will then deduplicate the ICD10 codes: where the same code appears in both versions with the same tag, I will mark it as existing in both dictionaries. With this method I can correct the inconsistencies between tags and also quantify the differences between versions: how many codes are common and how many are additional in one version.
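
The merge described above could look roughly like the following. This is a minimal sketch, not the actual curation script: the function name `merge_icd10_versions`, the `(code, tag)` input shape, the output field names, and the `"both"` marker are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_icd10_versions(rows_1998, rows_2019):
    """Merge two ICD10 lists of (code, tag) pairs under the single name ICD10.

    Returns (merged, conflicts):
      merged    - list of dicts with coding_system, code, tag, version
      conflicts - codes present in both versions with different tags
    """
    tags_by_code = defaultdict(dict)  # code -> {version: tag}
    for version, rows in (("1998", rows_1998), ("2019", rows_2019)):
        for code, tag in rows:
            tags_by_code[code.strip()][version] = tag.strip().lower()

    merged, conflicts = [], []
    for code in sorted(tags_by_code):
        by_version = tags_by_code[code]
        in_both = len(by_version) == 2
        if in_both and by_version["1998"] == by_version["2019"]:
            # Same code and same tag in both dictionaries: keep one row.
            merged.append({"coding_system": "ICD10", "code": code,
                           "tag": by_version["1998"], "version": "both"})
            continue
        if in_both:
            # Same code, different tags: needs clinical review.
            conflicts.append(code)
        for version, tag in sorted(by_version.items()):
            merged.append({"coding_system": "ICD10", "code": code,
                           "tag": tag, "version": version})
    return merged, conflicts


# Toy data, not taken from the real codelist.
rows_1998 = [("E10", "narrow"), ("E11", "narrow"), ("E12", "possible")]
rows_2019 = [("E10", "narrow"), ("E11", "possible"), ("E13", "narrow")]
merged, conflicts = merge_icd10_versions(rows_1998, rows_2019)
common = sum(1 for row in merged if row["version"] == "both")
```

Codes listed in `conflicts` stay as one row per version so the differing tags remain visible for clinical review, and `common` gives the number of codes shared by both versions.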

• Apparent “missing” codes (intentional exclusions)
Among the 4,681 codes shared as missing (found by comparing the CodeMapper output with the previous version of the full codelist on SharePoint), only 231 were tagged as narrow; the remainder were possible. All possible codes were intentionally excluded during the SharePoint → CodeMapper transition to reduce noise and improve performance, so their removal was purposeful.
Within the narrow set (281 codes), 16 were duplicates. The remaining entries were checked to determine whether any deletions were erroneous or whether tags had been changed to exclude. The removed codes were validated, and the rationale for each change is documented in the attached file.
• ICD9CMP inclusion
ICD9CMP (procedures) codes were appended only after generating the full codelist and thus did not appear in the CodeMapper output. They have now been added programmatically via script, as a temporary measure until this is fixed inside CodeMapper.

Comparisons between SharePoint and CodeMapper were based on a snapshot from ~3 months ago; substantial changes have occurred since, explaining observed discrepancies.
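
For reference, a comparison of this kind can be sketched as a set difference broken down by tag. The function name `missing_codes_by_tag` and the input shapes (a mapping for the previous codelist, a set for the CodeMapper output) are assumptions for illustration, and the data is invented.

```python
from collections import Counter


def missing_codes_by_tag(previous, current):
    """Find codes in the previous codelist that are absent from the current output.

    previous: dict mapping (coding_system, code) -> tag
    current:  set of (coding_system, code)
    Returns (missing, counts), where counts tallies the missing codes per tag.
    """
    missing = {key: tag for key, tag in previous.items() if key not in current}
    return missing, Counter(missing.values())


# Toy data, not taken from the real codelist.
previous = {
    ("ICD10", "E10"): "narrow",
    ("ICD10", "E11"): "possible",
    ("ICD10", "E12"): "possible",
}
current = {("ICD10", "E10")}
missing, counts = missing_codes_by_tag(previous, current)
```

Rerunning such a comparison against a fresh SharePoint export rather than the older snapshot would show which discrepancies are real.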

• free_text normalization
Instances where Free_text (or variants) were not identified consistently have been standardized to free_text via script.
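
A minimal sketch of this kind of label normalization, assuming the variants differ only in case and in the separator between "free" and "text". The function name and the example variants are hypothetical; the actual script may handle other spellings.

```python
import re


def normalize_free_text_label(label):
    """Map variants such as "Free_text" or "Free Text" to "free_text".

    Any other label is returned unchanged.
    """
    # Drop case, whitespace, underscores, and hyphens before comparing.
    key = re.sub(r"[\s_\-]+", "", label.strip().lower())
    return "free_text" if key == "freetext" else label


labels = ["Free_text", "free text", "FREE-TEXT", "ICD10"]
normalized = [normalize_free_text_label(label) for label in labels]
```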

Quality check steps required:
1-Ensure consistent naming and normalisation across coding systems
2-Check for missing tags and invalid tags
3-Report conflicting tags in the metadata file
4-Check which codelists follow the exact and closest match rule (this is now handled in CodeMapper)
5-Ensure code ranges (e.g., E10–E14) aren’t left as raw ranges that the pipeline can’t interpret. Currently they cannot be deleted directly in CodeMapper, so we need to manage them through a script.
6-CodeMapper has an option to unify tags based on predefined tags, but some tags from the import process have not yet been unified; we need to handle those.
7-Compare the regional ICD10 dictionary versions (French and Spanish) against the international ICD10 to ensure we are not missing any regional codes. If any additional codes are identified, we should map them to the closest concept and import them as a custom coding system in CodeMapper (e.g., ICD10_ES).
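
Steps 2, 3 and 5 lend themselves to a small script. The sketch below is illustrative only: the allowed tag set (`narrow`, `possible`, `exclude`) is an assumption based on the tags mentioned in this post, the function names are hypothetical, and the range expansion only handles three-character categories that share a letter.

```python
import re
from collections import defaultdict

# Assumed tag vocabulary; adjust to the predefined tags used in CodeMapper.
ALLOWED_TAGS = {"narrow", "possible", "exclude"}

# Matches ranges such as "E10-E14" or "E10–E14" (hyphen or en dash).
RANGE_RE = re.compile(r"^([A-Z])(\d{2})\s*[-\u2013]\s*([A-Z])(\d{2})$")


def expand_code_range(code):
    """Expand a raw range like "E10–E14" into individual category codes.

    Anything that is not a range is returned as a one-element list.
    """
    match = RANGE_RE.match(code.strip())
    if not match:
        return [code.strip()]
    letter_a, start, letter_b, end = match.groups()
    if letter_a != letter_b or int(start) > int(end):
        raise ValueError(f"unsupported range: {code!r}")
    return [f"{letter_a}{n:02d}" for n in range(int(start), int(end) + 1)]


def check_tags(rows):
    """Check (coding_system, code, tag) rows for missing, invalid, and conflicting tags."""
    missing, invalid = [], []
    tags_seen = defaultdict(set)
    for system, code, tag in rows:
        tag = (tag or "").strip().lower()
        if not tag:
            missing.append((system, code))
        elif tag not in ALLOWED_TAGS:
            invalid.append((system, code, tag))
        else:
            tags_seen[(system, code)].add(tag)
    # A code is conflicting when it carries more than one valid tag.
    conflicting = {key: sorted(tags) for key, tags in tags_seen.items() if len(tags) > 1}
    return missing, invalid, conflicting


# Toy data, not taken from the real codelist.
expanded = expand_code_range("E10\u2013E14")
rows = [
    ("ICD10", "E10", "narrow"),
    ("ICD10", "E10", "possible"),
    ("ICD10", "E11", ""),
    ("ICD10", "E12", "Narow"),
]
missing, invalid, conflicting = check_tags(rows)
```

A purely numeric expansion can produce categories that do not exist in a given dictionary version, so the expanded codes would still need to be filtered against the dictionary before import.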
