Thomas-Rauter/idutils-online

idutils-online

Online resolution, conversion, linking, and metadata for scholarly identifiers.

Documentation: https://Thomas-Rauter.github.io/idutils-online/

Overview

idutils-online is a Python package for working with scholarly identifiers through public registries and APIs.

Check whether identifiers exist, retrieve bibliographic metadata, discover linked identifiers for the same record, and convert between identifier systems. Inputs can be a single string or a batch of values; results are returned as pandas objects aligned with the input order.

The package builds on idutils for offline validation and normalization, and adds live HTTP lookups against providers such as NCBI, Europe PMC, Crossref, doi.org, arXiv, and ORCID.
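Checking an identifier's shape offline before any HTTP request keeps obviously malformed input from reaching a provider. The sketch below illustrates that idea with simplified regular expressions. It is a stand-in for illustration only, not how idutils or idutils-online work internally; the helper names `normalize_doi` and `looks_valid`, the patterns, and the sample DOI are all assumptions.

```python
import re

# Simplified shape checks (assumptions); far less thorough than idutils.
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")
PMID_SHAPE = re.compile(r"^[1-9]\d*$")


def normalize_doi(raw: str) -> str:
    """Strip surrounding whitespace and a common prefix, leaving the bare DOI."""
    value = raw.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if value.lower().startswith(prefix):
            value = value[len(prefix):]
            break
    return value


def looks_valid(value: str, id_type: str) -> bool:
    """Return True when the value has a plausible shape for the given type."""
    if id_type == "doi":
        return bool(DOI_SHAPE.match(normalize_doi(value)))
    if id_type == "pmid":
        return bool(PMID_SHAPE.match(value.strip()))
    raise ValueError(f"unsupported id_type: {id_type}")


# Only well-formed candidates would be passed on to a live lookup.
candidates = ["17170141", "not-a-pmid", " 31469695 "]
to_query = [c.strip() for c in candidates if looks_valid(c, "pmid")]
```

A shape check like this says nothing about whether the identifier is actually registered; that is the question the online lookups answer.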

The package is built around four main steps:

  1. Check registry existence with id_exists().
  2. Retrieve bibliographic metadata with id_metadata().
  3. Discover linked identifiers with id_links().
  4. Convert between systems with id_convert().

Installation

Install idutils-online from PyPI with:

pip install idutils-online

Minimal example

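from idutils_online import id_convert, id_exists, id_links, id_metadata

pmid = "17170141"

# 1. Check whether the identifier exists in the registry.
id_exists(
    pmid,
    id_type="pmid",
)
# 2. Retrieve bibliographic metadata for the record.
id_metadata(
    pmid,
    id_type="pmid",
)
# 3. Discover other identifiers linked to the same record.
id_links(
    pmid,
    id_type="pmid",
)
# 4. Convert the PMID to a DOI.
id_convert(
    pmid,
    target="doi",
    source="pmid",
)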

All four functions accept scalar or vector inputs:

# Batch input: results are aligned with the input order.
id_exists(
    ["17170141", "31469695"],
    id_type="pmid",
)
id_convert(
    ["17170141", "31469695"],
    target="doi",
    source="pmid",
)
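To make the scalar-or-vector behaviour and the input-order alignment concrete, here is a minimal standard-library sketch. It is not the package's implementation: `lookup_aligned`, `fake_resolver`, and the in-memory `FAKE_REGISTRY` are hypothetical names, the values are placeholders, and plain lists with `None` stand in for the pandas objects the real functions return.

```python
from typing import Callable, Iterable, List, Optional, Union


def lookup_aligned(
    values: Union[str, Iterable[str]],
    resolver: Callable[[str], str],
) -> List[Optional[str]]:
    """Resolve one identifier or a batch, returning results in input order."""
    # A bare string is treated as a batch of one, not iterated per character.
    batch = [values] if isinstance(values, str) else list(values)
    results: List[Optional[str]] = []
    for value in batch:
        try:
            results.append(resolver(value))
        except LookupError:
            # Keep a placeholder so output position i still matches input i.
            results.append(None)
    return results


# Hypothetical in-memory registry standing in for a live provider lookup.
FAKE_REGISTRY = {"id-1": "target-1", "id-3": "target-3"}


def fake_resolver(value: str) -> str:
    return FAKE_REGISTRY[value]  # raises KeyError (a LookupError) when absent


single = lookup_aligned("id-1", fake_resolver)
batch = lookup_aligned(["id-3", "id-2", "id-1"], fake_resolver)
```

Keeping a placeholder for unresolved inputs, rather than dropping them, is what lets a caller join the results back onto the original list by position.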

Public API

The current public API is centered on:

  • id_exists()
  • id_metadata()
  • id_links()
  • id_convert()

id_exists() and id_convert() return nullable pandas.Series. id_metadata() returns one row per input identifier. id_links() returns linked identifiers in long format (one row per link).
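The shapes described above can be sketched with hand-built pandas objects. Everything in this example is an assumption made for illustration: the identifiers and values are placeholders rather than registry results, the column names (`source_id`, `linked_type`, `linked_id`) are hypothetical, and indexing the Series by the input identifiers is a choice made here, not documented package behaviour.

```python
import pandas as pd

# Placeholder inputs, not real identifiers.
ids = ["id-a", "id-b", "id-c"]

# Nullable boolean Series: True/False where a lookup answered, <NA> where it could not.
exists = pd.Series([True, False, pd.NA], index=ids, dtype="boolean")

# Nullable string Series for a conversion-style result.
converted = pd.Series(["target-a", pd.NA, pd.NA], index=ids, dtype="string")

# Long format: one row per link, so an input with two links occupies two rows.
links = pd.DataFrame(
    {
        "source_id": ["id-a", "id-a", "id-b"],
        "linked_type": ["doi", "pmcid", "doi"],
        "linked_id": ["target-a", "target-a2", "target-b"],
    }
)

# Inputs whose existence could not be determined.
undetermined = exists[exists.isna()].index.tolist()

# Number of links found per input identifier.
links_per_input = links.groupby("source_id").size().to_dict()
```

A nullable dtype lets a result distinguish "does not exist" (`False`) from "no answer" (`<NA>`), and the long format lets one input carry any number of links without widening the table.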

Supported identifiers

Supported types and provider coverage are defined in the package registry. Inspect them programmatically:

from idutils_online.registry import (
    capabilities,
    capabilities_markdown,
    online_types,
)

online_types()            # e.g. ['arxiv', 'doi', 'orcid', 'pmcid', 'pmid']
capabilities()            # long-form DataFrame
capabilities_markdown()   # compact Markdown table

The documentation includes a capability matrix generated from the registry at build time.

Documentation

The documentation website is the main source of package documentation:

https://Thomas-Rauter.github.io/idutils-online/

Citation

If you use idutils-online in research, please cite the software. Citation metadata is available in CITATION.cff.

License

See LICENSE.
