Normalize version fields to strings in import scripts #39
arash77 wants to merge 4 commits into research-software-ecosystem:main
Thanks @arash77, this is a neat contribution. It's mixed together with formatting changes, though; while those are a great idea, they muddle what is the fix versus what is purely formatting. Is there a way you could split the two aspects? And if adopting PEP8 is desired in this repo, how about a GH Action that applies it automatically?
I will exclude the formatting from this PR. I can open a separate PR to discuss how automated formatting could be applied.
17db00a to 74ba363
Add normalize_version_fields function to convert version fields (which can be int, float, or str) to string type for consistency.

Integrate version normalization into all import scripts:
- bioconda: normalize package.version
- bioconductor: normalize Version
- biotools: normalize version and nested version fields
- galaxytool: normalize Suite_version, conda package version, and workflow versions
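
For readers following along, the conversion this commit describes might look roughly like the sketch below. It is not the code from this PR: only the function names come from the PR itself, while the signatures, the top-level-only field handling, and the decision to leave `bool` and `None` untouched are assumptions made for illustration.

```python
from typing import Any, Iterable


def normalize_version_to_string(value: Any) -> Any:
    """Return int/float versions as strings; leave other values unchanged."""
    # bool is a subclass of int, so it is excluded explicitly (assumed behaviour).
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    return value


def normalize_version_fields(record: dict, fields: Iterable[str]) -> dict:
    """Normalize the named top-level fields of `record` in place."""
    for field in fields:
        if field in record:
            record[field] = normalize_version_to_string(record[field])
    return record


# Hypothetical record; the values are made up for the example.
package = {"name": "example", "version": 2}
normalize_version_fields(package, ["version"])
print(package)  # {'name': 'example', 'version': '2'}
```

One caveat worth keeping in mind: a value that was already parsed as a float, such as `1.10`, stringifies to `'1.1'`, so this kind of normalization makes the type consistent but cannot restore digits lost at parse time.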
74ba363 to c1bb215
Pull request overview
This pull request introduces a new common utility module for normalizing version fields from numeric types to strings across various metadata import scripts, addressing data integrity issues when processing tool and package metadata.
Changes:
- Added `common/metadata.py` module with `normalize_version_to_string` and `normalize_version_fields` functions
- Updated four import scripts (galaxytool-import, biotools-import, bioconductor-import, bioconda-import) to use the new normalization functions
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| common/metadata.py | New utility module providing functions to normalize version fields (integers/floats) to strings with support for nested paths and list structures |
| galaxytool-import/galaxytool-import.py | Integrated version normalization for Suite_version, Latest_suite_conda_package_version, and Related_Workflows latest_version fields |
| biotools-import/import.py | Added version field normalization for both top-level version field and nested version fields within version arrays |
| bioconductor-import/import.py | Applied normalization to the Version field in package metadata |
| bioconda-import/bioconda_importer.py | Normalized package.version field in conda package data |
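
The "nested paths and list structures" support mentioned for `common/metadata.py` could work along the lines of the sketch below. This is an illustration under assumptions, not the module's actual code: `normalize_path`, `_to_string`, and the list-of-keys path convention are hypothetical names and choices, and the sample record only borrows field names from the table above with made-up values.

```python
from typing import Any, List


def _to_string(value: Any) -> Any:
    """Convert int/float to str; leave other values (including bool) unchanged."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    return value


def normalize_path(node: Any, path: List[str]) -> None:
    """Follow `path` through dicts, fanning out over lists, and normalize the leaf in place."""
    if isinstance(node, list):
        # Apply the same remaining path to every element of the list.
        for item in node:
            normalize_path(item, path)
        return
    if not isinstance(node, dict) or not path:
        return
    key, rest = path[0], path[1:]
    if key not in node:
        return
    if rest:
        normalize_path(node[key], rest)
    elif isinstance(node[key], list):
        node[key] = [_to_string(v) for v in node[key]]
    else:
        node[key] = _to_string(node[key])


# Hypothetical galaxytool-style record.
tool = {
    "Suite_version": 1.2,
    "Related_Workflows": [
        {"name": "wf-a", "latest_version": 3},
        {"name": "wf-b", "latest_version": "0.4.1"},
    ],
}
normalize_path(tool, ["Suite_version"])
normalize_path(tool, ["Related_Workflows", "latest_version"])
```

After these two calls, `Suite_version` and every `latest_version` in the sketch are strings, and missing keys are skipped silently rather than raising, which seems a reasonable default for import scripts that process heterogeneous records.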
Introduce a normalization function to convert version fields to strings across various import scripts, ensuring consistent data formatting. This change enhances data integrity when processing tool and package metadata.
Closes research-software-ecosystem/content#1190