scholid provides lightweight, dependency-free utilities for working
with scholarly identifiers in R. The package is designed as a small,
well-tested foundation that can be safely reused by other packages and
data workflows. It supports twenty identifier types — see Scope and
scholid_types().
See the full documentation at the scholid website.
For online lookup, conversion, metadata retrieval, and linked identifier
discovery, see scholidonline.
Install the released version from CRAN:
install.packages("scholid")

The package focuses on common identifier systems used in scholarly communication:
- DOI
- arXiv
- ADS bibcode
- OpenAlex
- Software Heritage (SWHID)
- ARK
- ISNI
- ORCID iD
- ROR
- RRID
- UniProt
- RefSeq
- SRA
- GEO
- BioProject
- Genome assembly (GCA/GCF)
- ISBN
- ISSN
- PubMed Central (PMCID)
- PubMed (PMID)
User-available functions:
| Function | Purpose |
|---|---|
| `scholid_types()` | List supported scholarly identifier types |
| `is_scholid(x, type)` | Test whether values conform to a given identifier type |
| `normalize_scholid(x, type)` | Normalize identifiers to canonical form |
| `extract_scholid(text, type)` | Extract identifiers of a given type from free text |
| `classify_scholid(x)` | Guess the identifier type of each input value |
| `detect_scholid_type(x)` | Detect identifier types from canonical or wrapped input values |
# list supported scholarly identifier types
scholid::scholid_types()
## [1] "doi" "arxiv" "bibcode" "openalex" "swhid"
## [6] "ark" "isni" "orcid" "ror" "rrid"
## [11] "uniprot" "refseq" "sra" "geo" "bioproject"
## [16] "assembly" "isbn" "issn" "pmcid" "pmid"
# test whether values match a given identifier type
scholid::is_scholid(
  x = "10.1000/182",
  type = "doi"
)
## [1] TRUE
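Because `is_scholid()` is described as testing "values", it can plausibly be used to filter a vector. The following is a minimal sketch under that assumption (a logical result with one element per input); the `candidates` vector is illustrative, and the call is guarded so the snippet is skipped when scholid is not installed.

```r
# Sketch: keep only the values that look like DOIs.
# Assumption: is_scholid() returns a logical vector with one element
# per input value.
candidates <- c("10.1000/182", "not a doi")

if (requireNamespace("scholid", quietly = TRUE)) {
  ok <- scholid::is_scholid(x = candidates, type = "doi")
  # Treat NA results as "not valid" so subsetting never yields NA entries.
  valid <- candidates[!is.na(ok) & ok]
  print(valid)
}
```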
# normalize identifiers to canonical form
scholid::normalize_scholid(
  x = "https://doi.org/10.1000/182",
  type = "doi"
)
## [1] "10.1000/182"
# extract identifiers of a given type from free text
scholid::extract_scholid(
  text = "See https://doi.org/10.1000/182 for details.",
  type = "doi"
)
## [[1]]
## [1] "10.1000/182"
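The `[[1]]` in the output suggests that `extract_scholid()` returns a list with one element per input string. Assuming that holds for longer inputs, matches from several texts could be collected into a single vector roughly as follows; the `texts` vector is made up for illustration.

```r
# Sketch: gather DOIs from several strings into one de-duplicated vector.
# Assumption: extract_scholid() accepts a character vector and returns a
# list with one element per input string.
texts <- c(
  "See https://doi.org/10.1000/182 for details.",
  "No identifier here."
)

if (requireNamespace("scholid", quietly = TRUE)) {
  found <- scholid::extract_scholid(text = texts, type = "doi")
  # unlist() drops empty elements, so texts without matches contribute nothing.
  dois <- unique(unlist(found))
  print(dois)
}
```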
# classify the identifier type of each input value
scholid::classify_scholid(
  x = c(
    "10.1000/182",
    "0000-0002-1825-0097",
    "not an id"
  )
)
## [1] "doi" "orcid" NA
# detect identifier types from canonical or wrapped input values
scholid::detect_scholid_type(
  x = c(
    "https://doi.org/10.1000/182",
    "ORCID: 0000-0002-1825-0097",
    "arXiv:2101.00001",
    "not an id"
  )
)
## [1] "doi" "orcid" "arxiv" NA
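Detection and normalization can be chained when the input mixes identifier types. The helper below, `normalize_mixed()`, is hypothetical and not part of scholid. It is a sketch that assumes `normalize_scholid()` accepts wrapped forms for every type that `detect_scholid_type()` recognizes; this README only demonstrates that for a DOI URL.

```r
# Sketch: detect each value's type, then normalize it with that type.
# normalize_mixed() is a hypothetical helper, not a scholid function.
normalize_mixed <- function(x) {
  types <- scholid::detect_scholid_type(x)
  out <- rep(NA_character_, length(x))
  # Only normalize values whose type was detected; the rest stay NA.
  for (i in which(!is.na(types))) {
    out[i] <- scholid::normalize_scholid(x = x[i], type = types[i])
  }
  data.frame(
    input = x,
    type = types,
    normalized = out,
    stringsAsFactors = FALSE
  )
}

if (requireNamespace("scholid", quietly = TRUE)) {
  print(normalize_mixed(c("https://doi.org/10.1000/182", "not an id")))
}
```

Returning a data frame keeps each input next to its detected type, so unrecognized values remain visible as `NA` rows instead of being dropped.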
For more detailed usage patterns, see the Get started vignette.
License: MIT