VAMDC Command-Line Interface

The vamdc command-line tool provides access to atomic and molecular spectroscopic data from the VAMDC (Virtual Atomic and Molecular Data Centre) infrastructure.

The CLI supports querying multiple species and multiple nodes simultaneously, leveraging high-level wrapper functions from the lines module for better performance and flexibility.

Installation

Recommended: Using uv

Install uv and add a shell alias:

# Install uv (if not already installed)
# See https://docs.astral.sh/uv/ for installation instructions

# Add to ~/.bashrc or ~/.zshrc
alias vamdc='uv run -m pyVAMDC.spectral.cli'

After adding the alias, restart your shell or run source ~/.bashrc (or ~/.zshrc).

Alternative: Direct execution

python -m pyVAMDC.spectral.cli

Command Structure

The CLI is organized into command groups:

vamdc
├── get          # Retrieve data from VAMDC
│   ├── nodes    # List available data nodes
│   ├── species  # List chemical species
│   └── lines    # Query spectral lines (supports multiple species/nodes)
├── count        # Inspect metadata without downloading
│   └── lines    # Get line counts and metadata (supports multiple species/nodes)
├── convert      # Perform unit conversions
│   └── energy   # Convert between energy, frequency, and wavelength units
└── cache        # Manage local cache
    ├── status   # Show cache information (includes XSAMS files)
    └── clear    # Remove cached data

Features

✨ Multiple species support: Query multiple species in one command
✨ Multiple nodes support: Query multiple data nodes simultaneously
✨ Intelligent node resolution: Use short names, IVO IDs, or full endpoints
✨ XSAMS cache integration: XSAMS files stored in cache by default
✨ Parallel processing: Leverages multiprocessing for faster queries
✨ Enhanced metadata: Added node and species_type columns to output
✨ Flexible truncation handling: Control query splitting behavior
✨ Unit conversion: Convert between energy, frequency, and wavelength units

Global Options

The CLI supports configurable verbosity levels to control error output depth, making it suitable for both interactive use and AI agent consumption.

Verbosity Flags (Mutually Exclusive)

--quiet, -q: Minimal output - Errors as one-liners only, ideal for AI agents to avoid context saturation
--verbose, -v: Detailed output - Verbose logging with context and detailed messages
--debug: Full debug output - Complete tracebacks and debug information

Default behavior: NORMAL mode (standard error messages without traceback)

Output Levels Explained

Level	Flag	Error Display	Use Case
SILENT	Set via `VAMDC_LOG_LEVEL=SILENT`	No errors shown	Automated scripts requiring clean output
MINIMAL	`--quiet` or `-q`	One-line summaries: `Error: Failed to convert InChI: ValueError`	AI agents, minimal context
NORMAL	(default)	Formatted errors with exception type and message	Interactive terminal use
VERBOSE	`--verbose` or `-v`	Detailed context including module names	Debugging data queries
DEBUG	`--debug`	Full stack traces with complete tracebacks	Development and troubleshooting

Environment Variable Control

You can also control logging via the VAMDC_LOG_LEVEL environment variable:

# Silent mode (no errors displayed)
export VAMDC_LOG_LEVEL=SILENT
vamdc get species

# Minimal mode (one-line errors)
export VAMDC_LOG_LEVEL=MINIMAL
vamdc get species

# Normal mode (default)
export VAMDC_LOG_LEVEL=NORMAL
vamdc get species

# Verbose mode (detailed context)
export VAMDC_LOG_LEVEL=VERBOSE
vamdc get species

# Debug mode (full tracebacks)
export VAMDC_LOG_LEVEL=DEBUG
vamdc get species

Note: CLI flags override environment variables. For example:

export VAMDC_LOG_LEVEL=DEBUG
vamdc --quiet get species  # Uses MINIMAL (--quiet overrides environment)
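
The precedence rule above (flag first, then environment variable, then the NORMAL default) can be sketched in Python. This is a hedged illustration, not the CLI's actual code: the function name resolve_log_level is hypothetical, and treating the environment value as case-insensitive with a fallback to NORMAL for unknown values is an assumption.

```python
import os

# Level names come from the table above; everything else here is illustrative.
LEVELS = ("SILENT", "MINIMAL", "NORMAL", "VERBOSE", "DEBUG")


def resolve_log_level(quiet=False, verbose=False, debug=False, env=None):
    """Return the effective level: a CLI flag wins, then the environment, then NORMAL."""
    env = os.environ if env is None else env
    if sum([quiet, verbose, debug]) > 1:
        raise ValueError("--quiet, --verbose and --debug are mutually exclusive")
    if quiet:
        return "MINIMAL"
    if verbose:
        return "VERBOSE"
    if debug:
        return "DEBUG"
    from_env = env.get("VAMDC_LOG_LEVEL", "NORMAL").upper()
    # Assumption: an unrecognised value falls back to NORMAL.
    return from_env if from_env in LEVELS else "NORMAL"


# --quiet overrides VAMDC_LOG_LEVEL=DEBUG, as in the example above.
print(resolve_log_level(quiet=True, env={"VAMDC_LOG_LEVEL": "DEBUG"}))
```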

Examples by Verbosity Level

Minimal Output (AI-Friendly)

# Perfect for AI agents - minimal context, clean output
vamdc --quiet get species
vamdc -q get lines --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N --lambda-min=3000 --lambda-max=5000

# Error output in MINIMAL mode:
# Error: Failed to convert InChI: ValueError

Normal Output (Default)

# Standard interactive use
vamdc get species
vamdc get lines --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N --lambda-min=3000 --lambda-max=5000

# Error output in NORMAL mode:
# ERROR - Failed to convert InChI: InChI=1S/... from node ivo://vamdc/basecol
#   ValueError: Invalid InChI structure

Verbose Output (Detailed Context)

# Detailed logging for monitoring queries
vamdc --verbose get lines --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N --lambda-min=3000 --lambda-max=5000
vamdc -v count lines --lambda-min=3000 --lambda-max=5000

# Error output in VERBOSE mode:
# ERROR - Error in species: Failed to convert InChI: InChI=1S/... from node ivo://vamdc/basecol
#   Exception type: ValueError
#   Exception message: Invalid InChI structure

Debug Output (Full Tracebacks)

# Complete debugging information with stack traces
vamdc --debug get lines --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N --lambda-min=3000 --lambda-max=5000

# Error output in DEBUG mode includes full traceback:
# ERROR - Error in species: Failed to convert InChI: InChI=1S/... from node ivo://vamdc/basecol
#   Exception type: ValueError
#   Exception message: Invalid InChI structure
# Traceback:
#   File "species.py", line 558, in addComputedChemicalInfo
#     number_unique_atoms, ... = getChemicalInformationsFromInchi(inchi)
#   File "species.py", line 523, in getChemicalInformationsFromInchi
#     mol = Chem.MolFromInchi(inchi, sanitize=False, removeHs=False)
# ValueError: Invalid InChI structure

When to Use Each Level

Use --quiet when:

Running automated scripts
Feeding output to AI agents (prevents context overflow)
You only care about results, not error details
Piping output to other commands

Use default (no flag) when:

Interactive terminal sessions
Normal data queries
You want to see errors but not overwhelming detail

Use --verbose when:

Monitoring long-running queries
You want to understand what the CLI is doing
Debugging data availability issues
Learning how queries are processed

Use --debug when:

Developing or troubleshooting
Reporting bugs (full stack traces help developers)
Investigating unexpected behavior
You need complete diagnostic information

Commands

`vamdc get nodes`

Get list of VAMDC data nodes and cache them locally.

Options:

-f, --format [json|csv|table]: Output format (default: table)
-o, --output PATH: Save output to file
--refresh: Force refresh cache

Examples:

vamdc get nodes
vamdc get nodes --format csv --output nodes.csv
vamdc get nodes --refresh

Sample output:

Fetching nodes from VAMDC Species Database...
Fetched 32 nodes and cached at ~/.cache/vamdc/nodes.csv

`vamdc get species`

Get list of chemical species and cache them locally.

Options:

-f, --format [json|csv|excel|table]: Output format for species data (default: table). Only used if --slap2 is NOT specified.
-o, --output PATH: Output file path for species data (when exporting). For --slap2, specifies directory for VOTable files.
--refresh: Force refresh cache
--filter-by TEXT: Filter by criteria (format: "column:value")
--slap2: Generate SLAP2-compliant VOTable XML files (independent of --format)

Examples:

Without --slap2 (export species data):

# Display species list in terminal (default)
vamdc get species

# Export as CSV file
vamdc get species --format csv --output species.csv

# Export as Excel file
vamdc get species --format excel --output species.xlsx

# Display as table with filter
vamdc get species --filter-by "name:CO"

With --slap2 (generate VOTable XML files):

# Generate VOTables in default cache directory (~/.cache/vamdc/votables/)
vamdc get species --slap2

# Generate VOTables in custom directory
vamdc get species --slap2 --output /archive/votables/

# Note: with --slap2, --format is IGNORED and --output names a directory, not a file
# This generates VOTables, NOT a species data export
vamdc get species --slap2 --output /my/votables/

Filter format:

String matching: "name:CO" (case-insensitive substring match)
Numeric range: "massNumber:100-200"
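
The two filter forms can be sketched as a small Python parser. This is only an illustration of the "column:value" convention described above: the function name parse_filter is hypothetical, and inclusive range bounds are an assumption, since the document does not say whether the endpoints are included.

```python
import re

# "100-200" style numeric range (integers or decimals).
RANGE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*-\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_filter(spec):
    """Turn 'column:value' into a predicate over a row (a dict keyed by column name)."""
    column, sep, value = spec.partition(":")
    if not sep or not column or not value:
        raise ValueError(f"expected 'column:value', got {spec!r}")
    match = RANGE.match(value)
    if match:
        low, high = float(match.group(1)), float(match.group(2))
        # Assumption: both bounds are inclusive.
        return lambda row: low <= float(row[column]) <= high
    needle = value.lower()
    # Case-insensitive substring match, as described above.
    return lambda row: needle in str(row[column]).lower()


rows = [
    {"name": "CO2", "massNumber": 44},
    {"name": "H2O", "massNumber": 18},
    {"name": "C10H8", "massNumber": 128},
]
by_name = parse_filter("name:CO")
by_mass = parse_filter("massNumber:100-200")
print([r["name"] for r in rows if by_name(r)])
print([r["name"] for r in rows if by_mass(r)])
```

Because the string form is a substring match, "name:CO" also matches CO2 in this sample.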

Sample output:

Fetching species from VAMDC Species Database...
Fetched 4958 species and cached at ~/.cache/vamdc/species.csv

SLAP2 VOTable Generation:

--slap2 is a completely independent operation from species data export. When you use --slap2:

--format is IGNORED (VOTables are always XML, not markdown/CSV/JSON)
--output specifies the directory for VOTable files (not a file path)
Species data is NOT exported; only VOTable XML files are created
VOTables are grouped by data node (one XML file per node)

Key differences:

Command	What happens	Output
`vamdc get species`	Display full species list in terminal	Terminal (markdown table)
`vamdc get species --format csv --output data.csv`	Export species to CSV	File: `data.csv`
`vamdc get species --slap2`	Generate SLAP2 VOTable XML files	Directory: `~/.cache/vamdc/votables/` (VOTable XML files)
`vamdc get species --slap2 --output /archive/`	Generate SLAP2 VOTable XML files	Directory: `/archive/` (VOTable XML files)

Important: Do NOT confuse:

--format table = display as markdown table in terminal
--slap2 = generate SLAP2-compliant VOTable XML files (machine-readable, not markdown)

These options serve different purposes: use one or the other, not both in the same command.

Examples:

# Generate VOTables in default cache directory
vamdc get species --slap2

# Generate VOTables in custom directory
vamdc get species --slap2 --output /archive/votables/

# Export species data to CSV (separate from VOTable generation)
vamdc get species --format csv --output species.csv

# Then separately generate VOTables
vamdc get species --slap2 --output /archive/votables/

Sample output with --slap2 flag:

Loaded 4958 species from cache

Generating SLAP2-compliant VOTable files...

Generated 12 SLAP2 VOTable file(s) to /archive/votables:
  CDMS: slap2_species_CDMS_20251106_150000.xml
    Species: 245
  JPL: slap2_species_JPL_20251106_150001.xml
    Species: 198
  TOPBASE: slap2_species_TOPBASE_20251106_150002.xml
    Species: 512
  ... (9 more nodes)

VOTable files are XML (machine-readable, not markdown). To inspect them:

# View VOTable XML structure
head -30 /archive/votables/slap2_species_CDMS_20251106_150000.xml

# Count species in a VOTable
grep -c "<TR>" /archive/votables/slap2_species_CDMS_20251106_150000.xml

`vamdc get lines` ⭐

Get spectral lines for one or more species from one or more nodes.

Options:

--inchikey TEXT: InChIKey of the species (can be specified multiple times)
--node TEXT: Node identifier - TAP endpoint, IVO ID, or shortname (can be specified multiple times)
--lambda-min FLOAT: Minimum wavelength in Angstrom (default: 0.0)
--lambda-max FLOAT: Maximum wavelength in Angstrom (default: 1.0e9)
-f, --format [xsams|slap2|csv|json|table|parquet]: Output format (default: table)
-o, --output PATH: Output file path (tabular) or directory (XSAMS/SLAP2/parquet). Default for XSAMS/SLAP2: cache directory
--accept-truncation: Accept truncated results without recursive splitting

Output format behavior:

xsams: Raw XSAMS XML files
- Default location: ~/.cache/vamdc/xsams/
- Custom location: Specify with --output /path/to/dir
slap2: SLAP2-compliant VOTable XML files
- Default location: ~/.cache/vamdc/votables/
- Custom location: Specify with --output /path/to/dir
- One file per data node and species type (atomic/molecular)
- Filename pattern: slap2_lines_{NODE}_{SPECIES_TYPE}_{TIMESTAMP}.xml
parquet: Columnar binary format (memory-efficient)
- Stored in QueryResults/ directory
- One aggregated parquet file per node and species type
- Filename pattern: {atomic|molecular}_{NODE}_{LAMBDA_MIN}_{LAMBDA_MAX}_{TIMESTAMP}.parquet
- Suitable for large datasets (efficient memory usage)
- Compatible with pandas, DuckDB, Apache Arrow, and Apache Spark
- Optional: Copy files to custom directory with --output /path/to/dir
csv/json/table: Converted tabular data with columns:
- All spectroscopic line data fields
- node: TAP endpoint of the data source
- species_type: atom or molecule

Important note about --format option: If you specify --format multiple times, the last value will be used. For example:

# Only slap2 is used (xsams is ignored)
vamdc get lines --inchikey=... --format xsams --format slap2

# Equivalent to:
vamdc get lines --inchikey=... --format slap2

Each command execution can only generate one output format. To get both XSAMS and SLAP2 files, run the command twice with different --format options.
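
The last-value-wins behaviour is common to many option parsers. A minimal sketch with Python's argparse, used here purely for illustration (this document does not show which parser the CLI itself uses), reproduces it with the format choices listed above:

```python
import argparse

# Stand-in parser; only the --format option is modelled.
parser = argparse.ArgumentParser(prog="get-lines-sketch")
parser.add_argument(
    "-f", "--format",
    choices=["xsams", "slap2", "csv", "json", "table", "parquet"],
    default="table",
)

# A repeated single-value option keeps only the last occurrence.
args = parser.parse_args(["--format", "xsams", "--format", "slap2"])
print(args.format)
```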

Examples:

Single species, single node (using short name)

# Query calcium (Ca) from topbase using short name
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node=topbase \
  --lambda-min=1000 \
  --lambda-max=2000 \
  --accept-truncation

Multiple species, single node (using short name)

# Query CO and H2O from CDMS using short name
vamdc get lines \
  --inchikey=UGFAIRIUMAVXCW-UHFFFAOYSA-N \
  --inchikey=XLYOFNOQVPJJNP-UHFFFAOYSA-N \
  --node=cdms \
  --lambda-min=100000 \
  --lambda-max=200000

Single species, multiple nodes (using short names)

# Query CO from multiple databases using short names
vamdc get lines \
  --inchikey=UGFAIRIUMAVXCW-UHFFFAOYSA-N \
  --node=cdms \
  --node=jpl \
  --node=basecol2015 \
  --lambda-min=100000 \
  --lambda-max=200000

Mixed identifier types (short names, IVO IDs, endpoints)

# Mix different identifier types in the same command
vamdc get lines \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --node=cdms \
  --node="ivo://vamdc/jpl/vamdc-tap_12.07" \
  --node="http://basecoltap2015.vamdc.org/12_07/TAP/" \
  --lambda-min=100000 \
  --lambda-max=200000
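
The three identifier forms accepted by --node can be told apart by their prefix. The sketch below is an assumption about how such a classification might look, not the CLI's resolver; the function name classify_node_identifier and the returned labels are hypothetical.

```python
def classify_node_identifier(identifier):
    """Label a --node value as an IVO ID, a TAP endpoint URL, or a short name."""
    lowered = identifier.strip().lower()
    if lowered.startswith("ivo://"):
        return "ivo_id"
    if lowered.startswith(("http://", "https://")):
        return "tap_endpoint"
    # Anything else is treated as a short name such as "cdms" or "jpl".
    return "short_name"


for value in ("cdms",
              "ivo://vamdc/jpl/vamdc-tap_12.07",
              "http://basecoltap2015.vamdc.org/12_07/TAP/"):
    print(value, "->", classify_node_identifier(value))
```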

All available species/nodes in wavelength range

# Query all available data in wavelength range
# (no --inchikey or --node specified)
vamdc get lines \
  --lambda-min=1000 \
  --lambda-max=2000 \
  --accept-truncation

XSAMS format output

# Download XSAMS to default cache directory
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node="http://topbase.obspm.fr/12.07/vamdc/tap//" \
  --lambda-min=1000 \
  --lambda-max=2000 \
  --format xsams \
  --accept-truncation

# Download XSAMS to custom directory
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --format xsams \
  --output /path/to/my/xsams/files \
  --lambda-min=1000 \
  --lambda-max=2000 \
  --accept-truncation

CSV output with multiple sources

# Get tabular data from multiple nodes
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --lambda-min=1000 \
  --lambda-max=2000 \
  --format csv \
  --output lines.csv \
  --accept-truncation

Sample output:

Querying spectral lines...
Wavelength range: 1000.0 - 2000.0 Angstrom
Filtering for 1 species...
Found 6 species entries matching InChIKeys
Filtering for 1 nodes...
Found 1 nodes matching identifiers
Fetching lines...
Retrieved atomic data from 2 node(s)
Total spectral lines retrieved: 10079
Lines saved to lines.csv

Parquet format output (memory-efficient for large datasets)

# Get data as parquet files (efficient columnar storage)
vamdc get lines \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --lambda-min=100000 \
  --lambda-max=200000 \
  --format parquet

# With custom output directory
vamdc get lines \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --node=cdms \
  --lambda-min=100000 \
  --lambda-max=200000 \
  --format parquet \
  --output /archive/parquet_files/

Sample output with --format parquet:

Querying spectral lines...
Wavelength range: 100000.0 - 200000.0 Angstrom
Filtering for 1 species...
Found 2 species entries matching InChIKeys
Fetching lines...
Retrieved molecular data from 1 node(s)
Processing parquet files...

Generated 1 parquet file(s):
  molecular_cdms_1.00e+05_2.00e+05_20260128T153000.parquet (12.45 MB)
    Node: https://cdms.astro.uni-koeln.de/cdms/tap/
    Type: molecule
    Path: /Users/user/project/QueryResults/molecular_cdms_1.00e+05_2.00e+05_20260128T153000.parquet

Total size: 12.45 MB
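
The parquet filename carries the query parameters, so it can be decoded without opening the file. The following sketch parses the pattern given above; the helper name parse_parquet_name is hypothetical, and the timestamp layout (YYYYMMDDTHHMMSS) is inferred from the sample filename rather than documented.

```python
import re
from datetime import datetime

# {atomic|molecular}_{NODE}_{LAMBDA_MIN}_{LAMBDA_MAX}_{TIMESTAMP}.parquet
PATTERN = re.compile(
    r"^(?P<species_type>atomic|molecular)_(?P<node>.+)_"
    r"(?P<lambda_min>[^_]+)_(?P<lambda_max>[^_]+)_"
    r"(?P<timestamp>\d{8}T\d{6})\.parquet$"
)


def parse_parquet_name(filename):
    """Return the fields encoded in a parquet filename, or None if it does not match."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    fields = match.groupdict()
    fields["lambda_min"] = float(fields["lambda_min"])
    fields["lambda_max"] = float(fields["lambda_max"])
    # Assumption: timestamp layout taken from the sample output above.
    fields["timestamp"] = datetime.strptime(fields["timestamp"], "%Y%m%dT%H%M%S")
    return fields


info = parse_parquet_name("molecular_cdms_1.00e+05_2.00e+05_20260128T153000.parquet")
print(info)
```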

Key benefits of parquet format:

✅ Memory efficient: Data stored on disk, not loaded entirely into RAM
✅ Columnar storage: Optimized for analytical queries and column-based operations
✅ Compressed: Smaller file sizes compared to CSV
✅ Fast queries: Efficient reading of specific columns without loading entire dataset
✅ Compatible: Works with pandas, DuckDB, Apache Arrow, Apache Spark, and other data tools
✅ Type-safe: Preserves data types (no string/number confusion like CSV)

When to use parquet format:

Large datasets that might cause memory issues with CSV/JSON
When you need to process data with pandas, DuckDB, or other analytics tools
When disk space is a concern (parquet is more compressed than CSV)
When you want to preserve exact data types and precision

Reading parquet files in Python:

import pandas as pd
import duckdb

# Using pandas
df = pd.read_parquet('molecular_cdms_1.00e+05_2.00e+05_20260128T153000.parquet')

# Using DuckDB (for SQL queries without loading to memory)
result = duckdb.query(
    "SELECT * FROM 'molecular_cdms_1.00e+05_2.00e+05_20260128T153000.parquet' "
    "WHERE \"Wavelength (m)\" < 0.0002"
).to_df()

SLAP2 VOTable output

# Generate SLAP2-compliant VOTable XML files in default cache directory
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node=topbase \
  --lambda-min=1000 \
  --lambda-max=2000 \
  --format slap2 \
  --accept-truncation

# Generate SLAP2 VOTables in custom directory
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node=topbase \
  --lambda-min=1000 \
  --lambda-max=2000 \
  --format slap2 \
  --output /archive/votables/ \
  --accept-truncation

# Generate SLAP2 VOTables for multiple species/nodes
vamdc get lines \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --inchikey=XLYOFNOQVPJJNP-UHFFFAOYSA-N \
  --node=cdms \
  --node=jpl \
  --lambda-min=100000 \
  --lambda-max=200000 \
  --format slap2 \
  --output /archive/votables/

Sample output with --format slap2 option:

Querying spectral lines...
Wavelength range: 1000.0 - 2000.0 Angstrom
Filtering for 1 species...
Found 1 species entries matching InChIKeys
Filtering for 1 nodes...
Resolved nodes, found species from 1 node(s)
Fetching lines...
Retrieved atomic data from 1 node(s)

Generating SLAP2-compliant VOTable files...

Generated 2 SLAP2 VOTable file(s) to /archive/votables/:
  slap2_lines_TOPBASE_atom_20251106_150000.xml
    Species type: atomic
    Lines: 2350
  slap2_lines_TOPBASE_molecule_20251106_150001.xml
    Species type: molecular
    Lines: 145

Key features of SLAP2 VOTable output:

✅ SLAP2-compliant XML format (machine-readable)
✅ Grouped by data node and species type (atomic/molecular)
✅ One XML file per node/species-type combination
✅ Includes vacuum wavelength, transition data, and Einstein coefficients
✅ Compatible with VO-compliant tools and services
✅ Includes metadata: query parameters, timestamps, data sources

Understanding query splitting:

Without --accept-truncation, queries that would return truncated results are automatically split into smaller sub-queries:

# This may be split into multiple sub-queries
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node="http://topbase.obspm.fr/12.07/vamdc/tap//" \
  --lambda-min=0 \
  --lambda-max=90009076900

With --accept-truncation, the query executes as-is even if truncated:

# Executes in one query, may be truncated
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node="http://topbase.obspm.fr/12.07/vamdc/tap//" \
  --lambda-min=0 \
  --lambda-max=90009076900 \
  --accept-truncation
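
The idea behind recursive splitting can be sketched as follows. The exact strategy the CLI uses is not described here, so this is an assumption: the range is bisected until no sub-range is reported as truncated. The is_truncated callback and the min_width guard are hypothetical stand-ins for a real HEAD request and a real stopping rule.

```python
def split_until_complete(lambda_min, lambda_max, is_truncated, min_width=1e-3):
    """Bisect [lambda_min, lambda_max] until no sub-range is reported as truncated."""
    too_narrow = (lambda_max - lambda_min) <= min_width
    if too_narrow or not is_truncated(lambda_min, lambda_max):
        return [(lambda_min, lambda_max)]
    mid = (lambda_min + lambda_max) / 2.0
    return (split_until_complete(lambda_min, mid, is_truncated, min_width)
            + split_until_complete(mid, lambda_max, is_truncated, min_width))


# Toy stand-in for a server check: any range wider than 500 Angstrom is "truncated".
def fake_is_truncated(low, high):
    return (high - low) > 500


sub_ranges = split_until_complete(1000.0, 2000.0, fake_is_truncated)
print(sub_ranges)
```

With --accept-truncation, the equivalent behaviour is to skip the check and issue the single original range.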

`vamdc count lines` ⭐

Inspect HEAD metadata for spectroscopic line queries without downloading full data. Supports multiple species and multiple nodes. Species and node filters are optional – if not specified, all species across all nodes are queried.

Options:

--inchikey TEXT: InChIKey of the species (can be specified multiple times, optional)
--node TEXT: Node identifier (can be specified multiple times, optional)
--lambda-min FLOAT: Minimum wavelength in Angstrom (default: 0.0)
--lambda-max FLOAT: Maximum wavelength in Angstrom (default: 1.0e9)

Use cases:

Query all available species across all nodes in a wavelength range
Query specific species only (filter by --inchikey)
Query specific nodes only (filter by --node)
Query specific species from specific nodes (both filters)

Examples:

Query all species across all nodes

# Get metadata for all data in a wavelength range
vamdc count lines \
  --lambda-min=0 \
  --lambda-max=90009076900

Query all nodes for a specific species

# Get metadata for a species from all nodes that have it
vamdc count lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --lambda-min=0 \
  --lambda-max=90009076900

Query all species from a specific node (using short name)

# Get metadata for all species from a specific node
vamdc count lines \
  --node=topbase \
  --lambda-min=0 \
  --lambda-max=90009076900

Single species, single node (using short name)

vamdc count lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node=topbase \
  --lambda-min=0 \
  --lambda-max=90009076900

Multiple species, multiple nodes (using short names)

vamdc count lines \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --inchikey=XLYOFNOQVPJJNP-UHFFFAOYSA-N \
  --node=cdms \
  --node=jpl \
  --lambda-min=100000 \
  --lambda-max=200000

Query all species from a specific node (using TAP endpoint)

# Get metadata for all species from a specific node
vamdc count lines \
  --node="http://topbase.obspm.fr/12.07/vamdc/tap//" \
  --lambda-min=0 \
  --lambda-max=90009076900

Single species, single node (using TAP endpoint)

vamdc count lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node="http://topbase.obspm.fr/12.07/vamdc/tap//" \
  --lambda-min=0 \
  --lambda-max=90009076900

Sample output (all species, all nodes):

Inspecting metadata for spectral lines...
Wavelength range: 0.0 - 90009076900.0 Angstrom
No species or node filters provided; querying all species across all nodes.
Fetching metadata (HEAD requests only)...

Sub-query 1: http://topbase.obspm.fr/12.07/vamdc/tap//sync?LANG=VSS2&REQUEST=doQuery...
  vamdc-approx-size: 66.90
  vamdc-count-radiative: 47778
  vamdc-count-species: 1
  vamdc-count-states: 1007
  vamdc-request-token: topbase:ebfda65c-83d3-4d10-a08b-1213b0a6bf7f:head
  vamdc-truncated: 20.9

Aggregated numeric headers across 1 sub-queries:
  vamdc-approx-size: 66.9
  vamdc-count-radiative: 47778
  vamdc-count-species: 1
  vamdc-count-states: 1007
  vamdc-truncated: 20.9

Multiple species, multiple nodes (using TAP endpoints)

vamdc count lines \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --inchikey=XLYOFNOQVPJJNP-UHFFFAOYSA-N \
  --node="https://cdms.astro.uni-koeln.de/cdms/tap/" \
  --node="http://basecoltap2015.vamdc.org/12_07/TAP/" \
  --lambda-min=100000 \
  --lambda-max=200000

This command performs HEAD requests to retrieve VAMDC count headers without downloading full datasets, showing:

Individual metadata per sub-query
Aggregated totals across all sub-queries
Truncation status
Estimated data sizes
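
Aggregating the per-sub-query headers can be sketched in a few lines. How the CLI combines each header is not specified in this document, so the sketch sums only the vamdc-count-* headers and leaves the others (such as vamdc-truncated) alone; the function name aggregate_counts is hypothetical, and a species returned by several sub-queries would be counted more than once under this simple sum.

```python
def aggregate_counts(sub_query_headers):
    """Sum the vamdc-count-* headers over a list of per-sub-query header dicts."""
    totals = {}
    for headers in sub_query_headers:
        for name, value in headers.items():
            if not name.startswith("vamdc-count-"):
                continue  # non-count headers are not combined in this sketch
            totals[name] = totals.get(name, 0) + int(value)
    return totals


# First dict mirrors the sample output above; the second is invented for illustration.
example = [
    {"vamdc-count-radiative": "47778", "vamdc-count-states": "1007",
     "vamdc-truncated": "20.9"},
    {"vamdc-count-radiative": "100", "vamdc-count-states": "3"},
]
print(aggregate_counts(example))
```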

`vamdc cache status`

Show cache status and metadata, including XSAMS files.

Example:

vamdc cache status

Sample output:

Cache directory: /Users/username/.cache/vamdc
Expiration time: 24 hours

Nodes: VALID (cached at 2025-10-21 14:59:35.657232)
Species: VALID (cached at 2025-10-21 14:59:43.941104)
Species Nodes: VALID (cached at 2025-10-21 14:59:43.941198)

XSAMS files: 1 file(s), 8.77 MB

Output shows:

Cache directory location
Expiration time (24 hours)
Status of each cached dataset (VALID, EXPIRED, or NOT CACHED)
Cache timestamps
XSAMS files count and total size

`vamdc cache clear`

Remove all cached data including XSAMS files.

Example:

vamdc cache clear

This removes:

Nodes cache
Species cache
Species-nodes mapping
All cached XSAMS files

`vamdc convert energy` 🔄

Convert between electromagnetic units (energy, frequency, wavelength). Supports conversions across different physical quantities using fundamental physical constants.

Arguments:

VALUE: The numerical value to convert (required, positional)

Options:

-f, --from-unit TEXT: Source unit (required)
-t, --to-unit TEXT: Target unit (required)

Supported Units:

Category	Units
Energy	joule, millijoule, microjoule, nanojoule, picojoule, eV, erg, kelvin, rydberg, cm-1
Frequency	hertz, kilohertz, megahertz, gigahertz, terahertz
Wavelength	meter, centimeter, millimeter, micrometer, nanometer, angstrom

Features:

✅ Case-insensitive unit names
✅ Cross-category conversions (e.g., wavelength → energy)
✅ Smart output formatting (scientific notation for very large/small values)
✅ Verbose mode with category conversion details

Examples:

Basic conversions

# Convert 500 nanometers to electron volts
vamdc convert energy 500 --from-unit=nanometer --to-unit=eV
# Output: 2.479683969 eV

# Convert 1.5 eV to wavenumber (cm-1)
vamdc convert energy 1.5 --from-unit=eV --to-unit=cm-1
# Output: 12098.31591 cm-1

# Convert 3000 angstroms to nanometers
vamdc convert energy 3000 -f angstrom -t nanometer
# Output: 300 nanometer

# Convert frequency to wavelength
vamdc convert energy 100 --from-unit=gigahertz --to-unit=meter
# Output: 0.00299792458 meter

Case-insensitive input

# Units are case-insensitive - all of these work:
vamdc convert energy 500 --from-unit=NANOMETER --to-unit=EV
vamdc convert energy 500 --from-unit=NanoMeter --to-unit=eV
vamdc convert energy 500 --from-unit=nanometer --to-unit=ev
# All produce: 2.479683969 eV

With verbose mode

# Show conversion details and category information
vamdc --verbose convert energy 100 -f gigahertz -t meter
# Output:
# 0.00299792458 meter
# Conversion details:
#   Input: 100.0 gigahertz
#   Output: 0.00299792458 meter
#   Category conversion: frequency → wavelength

Scientific notation for extreme values

# Small input, very large result
vamdc convert energy 0.0001 --from-unit=joule --to-unit=eV
# Output: 6.241509e+14 eV

# Very small input
vamdc convert energy 1e-10 --from-unit=meter --to-unit=angstrom
# Output: 1e+00 angstrom

Cross-category conversions

The converter handles conversions between different physical quantities:

# Energy → Frequency
vamdc convert energy 1.5 --from-unit=eV --to-unit=terahertz

# Energy → Wavelength
vamdc convert energy 2.479683969 --from-unit=eV --to-unit=nanometer

# Frequency → Wavelength
vamdc convert energy 100 --from-unit=gigahertz --to-unit=meter

# Wavelength → Energy
vamdc convert energy 500 --from-unit=nanometer --to-unit=eV

# Temperature (in Kelvin) → Energy (in eV)
vamdc convert energy 11604.5 --from-unit=kelvin --to-unit=eV
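
The physics behind cross-category conversion is E = hν = hc/λ. The sketch below shows one way to implement it, by converting every input to joules and then to the target unit. Pivoting through joules is a design choice made for this illustration, not necessarily what the CLI does, and only a few of the supported units are covered.

```python
H = 6.62607015e-34    # Planck constant in J s (exact SI value)
C = 299_792_458.0     # speed of light in m/s (exact SI value)
EV = 1.602176634e-19  # joules per electronvolt (exact SI value)

# Each unit maps to (to_joule, from_joule). Wavelengths use E = hc/lambda,
# frequencies use E = h*nu, wavenumbers (cm-1) use E = hc * 100 * k.
UNITS = {
    "joule": (lambda v: v, lambda e: e),
    "ev": (lambda v: v * EV, lambda e: e / EV),
    "cm-1": (lambda v: H * C * v * 100.0, lambda e: e / (H * C * 100.0)),
    "hertz": (lambda v: H * v, lambda e: e / H),
    "gigahertz": (lambda v: H * v * 1e9, lambda e: e / (H * 1e9)),
    "meter": (lambda v: H * C / v, lambda e: H * C / e),
    "nanometer": (lambda v: H * C / (v * 1e-9), lambda e: H * C / e * 1e9),
    "angstrom": (lambda v: H * C / (v * 1e-10), lambda e: H * C / e * 1e10),
}


def convert(value, from_unit, to_unit):
    """Convert between energy, frequency and wavelength units via joules."""
    to_joule, _ = UNITS[from_unit.lower()]    # unit names are case-insensitive
    _, from_joule = UNITS[to_unit.lower()]
    return from_joule(to_joule(value))


print(f"{convert(500, 'nanometer', 'eV'):.9f} eV")
print(f"{convert(100, 'gigahertz', 'meter'):.11f} meter")
```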

Common conversions:

# Visible light range conversions
# Red: 700 nm
vamdc convert energy 700 -f nanometer -t eV
# Output: 1.771203 eV

# Green: 550 nm
vamdc convert energy 550 -f nanometer -t eV
# Output: 2.254258 eV

# Violet: 400 nm
vamdc convert energy 400 -f nanometer -t eV
# Output: 3.099605 eV

# Spectroscopic wavenumber
vamdc convert energy 5000 -f cm-1 -t eV
# Output: 0.619921 eV

# Radio frequency
vamdc convert energy 1.4 -f gigahertz -t meter
# Output: 0.21413747 meter (21.4 cm wavelength - common in radio astronomy)

Error handling:

Invalid unit specifications show all supported units:

vamdc convert energy 500 --from-unit=invalid --to-unit=eV
# Error: Invalid from-unit 'invalid'. Supported units:
#   energy: joule, millijoule, microjoule, nanojoule, picojoule, eV, erg, kelvin, rydberg, cm-1
#   frequency: hertz, kilohertz, megahertz, gigahertz, terahertz
#   wavelength: meter, centimeter, millimeter, micrometer, nanometer, angstrom

Use cases:

Convert spectral line wavelengths to energies:

# Convert observed wavelength (Angstroms) to eV for comparison with theory
vamdc convert energy 4861 -f angstrom -t eV  # Hydrogen Balmer beta (H-beta)
# Output: 2.550590 eV

Convert between observational and theoretical units:

# Convert radio frequency observation to wavelength
vamdc convert energy 345 -f gigahertz -t millimeter
# Output: 0.868964 millimeter (near the CO J=3-2 line used in radio astronomy)

Temperature to energy for thermal populations:

# Convert room temperature to energy
vamdc convert energy 300 -f kelvin -t eV
vamdc convert energy 300 -f kelvin -t cm-1

Pipeline integration:

# Use in shell scripts
wavelength=500
energy=$(vamdc convert energy "$wavelength" -f nanometer -t eV | awk '{print $1}')
echo "Wavelength: ${wavelength} nm = ${energy} eV"
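
The same extraction can be done in Python when the command output is captured as text. This sketch assumes the first output line has the form "<value> <unit>", as in the sample outputs above; the helper name parse_conversion_output is hypothetical.

```python
def parse_conversion_output(text):
    """Split the first line of '<value> <unit>' output into (float, unit)."""
    first_line = text.strip().splitlines()[0]
    value, unit = first_line.split(maxsplit=1)
    return float(value), unit


print(parse_conversion_output("2.479683969 eV"))
```

The text could come from subprocess.run([...], capture_output=True, text=True).stdout, with the argument list set to the vamdc command being run.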

Caching System

The CLI automatically caches downloaded data to avoid redundant network requests.

Cache location:

Default: ~/.cache/vamdc/
Override with VAMDC_CACHE_DIR environment variable

Cached data:

nodes.csv - VAMDC data nodes
species.csv - Chemical species database (4958+ species)
species_nodes.csv - Species-to-node mappings
xsams/ - XSAMS XML files directory
- Raw XSAMS XML files from queries
votables/ - SLAP2 VOTable XML files directory
- Generated by vamdc get species --slap2 and vamdc get lines --format slap2
*_timestamp.json - Metadata files tracking cache timestamps

Cache expiration:

Metadata (nodes, species): 24 hours from last fetch
XSAMS files: No automatic expiration (managed by user)
VOTable files: No automatic expiration (managed by user)
Use --refresh flag to force metadata update
Check status with vamdc cache status
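
The 24-hour rule for metadata can be sketched as a pure function over timestamps. This is an illustration, not the CLI's implementation: the name cache_state is hypothetical, and treating an entry that is exactly 24 hours old as expired is an assumption.

```python
from datetime import datetime, timedelta

EXPIRATION = timedelta(hours=24)  # metadata lifetime stated above


def cache_state(cached_at, now=None):
    """Return VALID, EXPIRED or NOT CACHED for a cache timestamp (or None)."""
    if cached_at is None:
        return "NOT CACHED"
    now = now or datetime.now()
    # Assumption: an entry exactly 24 hours old counts as expired.
    return "VALID" if now - cached_at < EXPIRATION else "EXPIRED"


fetched = datetime(2025, 10, 21, 15, 0)
print(cache_state(fetched, now=datetime(2025, 10, 22, 14, 0)))
print(cache_state(fetched, now=datetime(2025, 10, 22, 16, 0)))
```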

XSAMS files management:

Default location: ~/.cache/vamdc/xsams/
Files named by query token: <node>:<token>:get.xsams

SLAP2 VOTable files management:

Default location: ~/.cache/vamdc/votables/
Sources:
- Species VOTables (from vamdc get species --slap2):
  - Named pattern: slap2_species_{NODE_NAME}_{TIMESTAMP}.xml
  - One file per data node (as per SLAP2 specification)
  - Examples:
    - slap2_species_CDMS_20251106_150000.xml
    - slap2_species_JPL_20251106_150001.xml
- Lines VOTables (from vamdc get lines --format slap2):
  - Named pattern: slap2_lines_{NODE}_{SPECIES_TYPE}_{TIMESTAMP}.xml
  - One file per node and species type (atomic/molecular)
  - Examples:
    - slap2_lines_TOPBASE_atom_20251106_150000.xml
    - slap2_lines_CDMS_molecule_20251106_150001.xml
View count and size: vamdc cache status
Clear all XSAMS and VOTables: vamdc cache clear

Environment Variables

`VAMDC_CACHE_DIR`

Override default cache directory location.

Example:

export VAMDC_CACHE_DIR=~/my_vamdc_cache
vamdc get species  # Uses ~/my_vamdc_cache/
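
Scripts that read cached files directly need to apply the same override. A minimal sketch, assuming the documented default of ~/.cache/vamdc/ and the VAMDC_CACHE_DIR variable; the function name cache_dir is hypothetical.

```python
import os
from pathlib import Path

DEFAULT_CACHE = "~/.cache/vamdc"  # default location stated in this document


def cache_dir(env=None):
    """Return the cache directory, honouring VAMDC_CACHE_DIR when it is set."""
    env = os.environ if env is None else env
    return Path(env.get("VAMDC_CACHE_DIR", DEFAULT_CACHE)).expanduser()


print(cache_dir({"VAMDC_CACHE_DIR": "/tmp/my_vamdc_cache"}))
print(cache_dir({}) / "species.csv")
```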

`VAMDC_LOG_LEVEL`

Control logging verbosity globally without using CLI flags.

Supported values: SILENT, MINIMAL, NORMAL, VERBOSE, DEBUG

Examples:

# Silent mode - no error messages
export VAMDC_LOG_LEVEL=SILENT
vamdc get species

# Minimal mode - one-line errors (ideal for AI agents)
export VAMDC_LOG_LEVEL=MINIMAL
vamdc get lines --inchikey=...

# Normal mode - standard error messages (default)
export VAMDC_LOG_LEVEL=NORMAL
vamdc get species

# Verbose mode - detailed logging
export VAMDC_LOG_LEVEL=VERBOSE
vamdc count lines --lambda-min=1000 --lambda-max=2000

# Debug mode - full tracebacks
export VAMDC_LOG_LEVEL=DEBUG
vamdc get lines --inchikey=...

Combining with cache directory:

# Set both environment variables
export VAMDC_CACHE_DIR=~/my_vamdc_cache
export VAMDC_LOG_LEVEL=MINIMAL
vamdc get species

Note: CLI flags (--quiet, --verbose, --debug) override the VAMDC_LOG_LEVEL environment variable.

Finding Species InChIKeys

To find the InChIKey for a species:

# Download species list
vamdc get species --format csv --output species.csv

# Search for your species (e.g., CO)
grep -i "CO" species.csv

# Or use the filter option
vamdc get species --filter-by "name:CO"

Pro tip: The species database includes:

InChIKey (unique identifier)
Chemical formula
Species name
Species type (atom/molecule)
Available nodes (TAP endpoints)
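
Because grep -i "CO" also matches names such as CO2 or cobalt, an exact lookup is often more useful. The sketch below works on an invented three-row sample; the column names "name" and "InChIKey" are assumptions about the exported CSV and should be checked against the real header row.

```python
import csv
import io

# Invented sample data; column names are assumptions, not the documented schema.
SAMPLE = """name,InChIKey
CO,UGFAIRIUMAVXCW-UHFFFAOYSA-N
CO2,CURLTUGMZLYLDI-UHFFFAOYSA-N
H2O,XLYOFNOQVPJJNP-UHFFFAOYSA-N
"""


def find_inchikeys(csv_file, species_name):
    """Return the sorted, de-duplicated InChIKeys whose name matches exactly."""
    reader = csv.DictReader(csv_file)
    return sorted({row["InChIKey"] for row in reader if row["name"] == species_name})


print(find_inchikeys(io.StringIO(SAMPLE), "CO"))
```

For a real export, pass open("species.csv", newline="") instead of the in-memory sample.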

Common Workflows

Explore available data

# List all nodes (32 data centers)
vamdc get nodes

# Get full species database (4958+ species)
vamdc get species --format csv --output species.csv

# Find a specific molecule
vamdc get species --filter-by "name:H2O"

# Check which nodes have your species
vamdc get species --filter-by "name:CO" | grep -i "tapEndpoint"

Query spectral lines efficiently

# Step 1: Find the InChIKey
vamdc get species --filter-by "name:Ca"
# Result: DONWDOGXJBIXRQ-UHFFFAOYSA-N

# Step 2: Check available data (HEAD request only)
vamdc count lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node=topbase \
  --lambda-min=1000 \
  --lambda-max=2000

# Step 3: Download the data using short node name
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node=topbase \
  --lambda-min=1000 \
  --lambda-max=2000 \
  --format csv \
  --output ca_lines.csv \
  --accept-truncation

Explore data availability across all sources

# Check how much data is available in a wavelength range (no filters)
vamdc count lines \
  --lambda-min=1000 \
  --lambda-max=2000

# This queries all species from all nodes without filtering
# Useful for understanding data coverage across the entire VAMDC infrastructure

Query multiple species simultaneously

# Get data for multiple molecules in one command
vamdc get lines \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --inchikey=XLYOFNOQVPJJNP-UHFFFAOYSA-N \
  --inchikey=UGFAIRIUMAVXCW-UHFFFAOYSA-N \
  --lambda-min=100000 \
  --lambda-max=200000 \
  --format csv \
  --output multiple_species.csv

Compare data from multiple nodes

# Get the same species from different databases using short names
vamdc get lines \
  --inchikey=UGFAIRIUMAVXCW-UHFFFAOYSA-N \
  --node=cdms \
  --node=jpl \
  --lambda-min=100000 \
  --lambda-max=200000 \
  --format csv \
  --output co_comparison.csv

# The output CSV includes a 'node' column to identify the source
# You can also mix different identifier types:
vamdc get lines \
  --inchikey=UGFAIRIUMAVXCW-UHFFFAOYSA-N \
  --node=cdms \
  --node="ivo://vamdc/jpl/vamdc-tap_12.07" \
  --node="http://basecoltap2015.vamdc.org/12_07/TAP/" \
  --lambda-min=100000 \
  --lambda-max=200000 \
  --format csv \
  --output co_all_sources.csv

Work with XSAMS files and VOTable files

# Download XSAMS to cache using short node name
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node=topbase \
  --lambda-min=1000 \
  --lambda-max=2000 \
  --format xsams \
  --accept-truncation

# Check XSAMS cache status
vamdc cache status

# Download to custom directory for archiving
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node=topbase \
  --format xsams \
  --output /archive/2025/calcium/ \
  --lambda-min=1000 \
  --lambda-max=2000 \
  --accept-truncation

# Download from multiple nodes
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node=topbase \
  --node=chianti \
  --format xsams \
  --output /archive/2025/calcium/ \
  --accept-truncation

# Generate SLAP2 VOTable files in default cache directory
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node=topbase \
  --lambda-min=1000 \
  --lambda-max=2000 \
  --format slap2 \
  --accept-truncation

# Generate SLAP2 VOTable files in custom directory
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --node=topbase \
  --node=chianti \
  --lambda-min=1000 \
  --lambda-max=2000 \
  --format slap2 \
  --output /archive/2025/votables/ \
  --accept-truncation

# Generate SLAP2 VOTables for multiple species from multiple sources
vamdc get lines \
  --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --node=topbase \
  --node=cdms \
  --lambda-min=1000 \
  --lambda-max=10000 \
  --format slap2 \
  --output /archive/2025/votables/

Node Identifiers

The --node parameter accepts three types of identifiers with intelligent resolution:

Supported Node Identifier Types

Short name (most convenient):

```
--node="cdms"
--node="jpl"
--node="topbase"
```

- Short, memorable identifiers for common nodes
- Case-insensitive matching
- Example: vamdc get lines --inchikey=... --node=cdms

IVO identifier (programmatic use):

```
--node="ivo://vamdc/TOPbase/tap-xsams"
--node="ivo://vamdc/cdms/vamdc-tap_12.07"
```

- Full Virtual Observatory identifier
- Unambiguous and machine-readable
- Example: vamdc get lines --inchikey=... --node="ivo://vamdc/cdms/vamdc-tap_12.07"

TAP endpoint URL (full endpoint):

```
--node="http://topbase.obspm.fr/12.07/vamdc/tap//"
--node="https://cdms.astro.uni-koeln.de/cdms/tap/"
```

- Complete TAP endpoint URL
- Most explicit identifier
- Example: vamdc get lines --inchikey=... --node="https://cdms.astro.uni-koeln.de/cdms/tap/"

Resolution Strategy

The CLI resolves any identifier to a full TAP endpoint in up to four steps. A match at any step returns the endpoint immediately:

Step 1: Try matching as TAP endpoint (full URL)
  └─ If not found → continue to Step 2

Step 2: Try matching as IVO identifier
  └─ If not found → continue to Step 3

Step 3: Try matching as short name
  └─ If not found → continue to Step 4

Step 4: Try matching against nodes table (fallback)
  └─ If not found → Raise error with helpful message

Example resolution flow:

User input: "cdms"
├─ Step 1: Is it "https://..." URL? No
├─ Step 2: Is it "ivo://..." ID? No
├─ Step 3: Is it a short name "cdms"? Yes ✓
└─ Result: "https://cdms.astro.uni-koeln.de/cdms/tap/"
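
The same idea can be sketched in a few lines of Python. This is a simplified illustration, not the CLI's code: `resolve_node` and `NODES` are hypothetical, the records reuse values from the examples in this document (the TOPbase short name is a guess), and for brevity the comparison is case-insensitive for every identifier type.

```python
# Illustrative node records; the values are taken from the examples in this document.
NODES = [
    {
        "shortName": "CDMS",
        "ivoIdentifier": "ivo://vamdc/cdms/vamdc-tap_12.07",
        "tapEndpoint": "https://cdms.astro.uni-koeln.de/cdms/tap/",
    },
    {
        "shortName": "TOPbase",  # assumed short name
        "ivoIdentifier": "ivo://vamdc/TOPbase/tap-xsams",
        "tapEndpoint": "http://topbase.obspm.fr/12.07/vamdc/tap//",
    },
]


def resolve_node(identifier, nodes=NODES):
    """Map a TAP URL, IVO identifier, or short name to a TAP endpoint."""
    # Try the most explicit identifier type first, then the looser ones.
    for column in ("tapEndpoint", "ivoIdentifier", "shortName"):
        for node in nodes:
            if node[column].lower() == identifier.lower():
                return node["tapEndpoint"]
    raise ValueError(f"No node matching '{identifier}' was found.")


print(resolve_node("cdms"))
```

Checking the columns in a fixed order means an input that could match more than one column resolves the same way every time.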

Finding Node Identifiers

Get all available node identifiers:

# View all nodes with their identifiers
vamdc get nodes --format csv

# View specific columns
vamdc get nodes --format csv | cut -d',' -f1,2,3

# Search for a specific node (e.g., CDMS)
vamdc get nodes --format csv | grep -i "cdms"

Output includes:

shortName: Short identifier (e.g., "CDMS")
ivoIdentifier: Full IVO ID (e.g., "ivo://vamdc/cdms/vamdc-tap_12.07")
tapEndpoint: Full TAP URL (e.g., "https://cdms.astro.uni-koeln.de/cdms/tap/")

Examples by Identifier Type

Using short name (RECOMMENDED)

# Simple and readable
vamdc get lines --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N --node=cdms
vamdc get lines --inchikey=DONWDOGXJBIXRQ-UHFFFAOYSA-N --node=topbase
vamdc get lines --inchikey=XLYOFNOQVPJJNP-UHFFFAOYSA-N --node=basecol2015

# Multiple nodes using short names
vamdc get lines \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --node=cdms --node=jpl --node=basecol2015

Using IVO identifier

# Explicit and unambiguous
vamdc get lines \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --node="ivo://vamdc/cdms/vamdc-tap_12.07"

# Multiple nodes using IVO identifiers
vamdc get lines \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --node="ivo://vamdc/cdms/vamdc-tap_12.07" \
  --node="ivo://vamdc/basecol2015/vamdc-tap"

Using TAP endpoint URL

# Full endpoint
vamdc get lines \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --node="https://cdms.astro.uni-koeln.de/cdms/tap/"

# Mixed identifiers (all types work together)
vamdc get lines \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --node=cdms \
  --node="ivo://vamdc/jpl/vamdc-tap_12.07" \
  --node="http://basecoltap2015.vamdc.org/12_07/TAP/"

Error Handling

When an invalid node identifier is provided:

# Invalid short name
vamdc get lines --inchikey=... --node=invalid_xyz
# Error: No node matching 'invalid_xyz' was found.
#        Try using a full TAP endpoint URL, short name
#        (e.g., 'cdms'), or IVO identifier.

To troubleshoot:

List all available nodes: vamdc get nodes
Check the short name, IVO ID, or endpoint format
Verify the node has data for your species

Species Identifiers

The --inchikey parameter identifies the chemical species to query. It can be repeated to query several species at once, and each key is validated against the species database.

Understanding InChIKey

An InChIKey is a unique, standardized identifier for chemical substances. It is a fixed-length, 27-character string derived by hashing the IUPAC International Chemical Identifier (InChI).

Format:

OKTJSMMVPCPJKN-UHFFFAOYSA-N
│              │          │
│              │          └─ Protonation indicator (1 char)
│              └─ Second block (10 chars): hash of the remaining layers, plus standard and version flags
└─ First block (14 chars): hash of the molecular skeleton (connectivity)

Example InChIKeys:

Carbon: OKTJSMMVPCPJKN-UHFFFAOYSA-N
Carbon Monoxide (CO): UGFAIRIUMAVXCW-UHFFFAOYSA-N
Water (H₂O): XLYOFNOQVPJJNP-UHFFFAOYSA-N
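
Because every InChIKey has the same hyphen-separated shape as the example keys above, a malformed key can be caught locally before a query is sent. The following sketch is an optional pre-check, not part of the CLI; `looks_like_inchikey` is a hypothetical helper, and passing it does not mean the species exists in any node.

```python
import re

# 14 letters, hyphen, 10 letters, hyphen, 1 letter (27 characters in total).
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(text):
    """Check the shape of an InChIKey; this does not prove the species exists."""
    return INCHIKEY_PATTERN.fullmatch(text) is not None


print(looks_like_inchikey("XLYOFNOQVPJJNP-UHFFFAOYSA-N"))  # True
print(looks_like_inchikey("INVALID-KEY"))                  # False
```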

Finding Species InChIKeys

Method 1: Search the species database

# Get all species with "CO" in the name
vamdc get species --filter-by "name:CO"

# Output shows InChIKey and other properties
InChIKey                     name             formula  speciesType
UGFAIRIUMAVXCW-UHFFFAOYSA-N  carbon monoxide  CO       molecule

Method 2: Export and search

# Export full species database
vamdc get species --format csv --output species.csv

# Search for specific species
grep -i "carbon" species.csv | head -5

Method 3: Query single species

# Query a specific molecule (CO) from a node
vamdc get lines \
  --inchikey=UGFAIRIUMAVXCW-UHFFFAOYSA-N \
  --node=cdms \
  --lambda-min=100000 \
  --lambda-max=200000

Method 4: Query species with short node names

# Query water (H₂O) from multiple nodes using short names
vamdc get lines \
  --inchikey=XLYOFNOQVPJJNP-UHFFFAOYSA-N \
  --node=cdms \
  --node=jpl \
  --lambda-min=100000 \
  --lambda-max=200000 \
  --format csv \
  --output water_lines.csv

Common Species InChIKeys

Here are some frequently-used species:

Species	InChIKey	Type
Hydrogen (H)	`YZCKVEUIGOORGS-UHFFFAOYSA-N`	atom
Helium (He)	`SWQJXJOGLNCZEY-UHFFFAOYSA-N`	atom
Carbon (C)	`OKTJSMMVPCPJKN-UHFFFAOYSA-N`	atom
Nitrogen (N)	`QJGQUHMNIGDVPM-UHFFFAOYSA-N`	atom
Oxygen (O)	`QVGXLLKGJNJLOE-UHFFFAOYSA-N`	atom
Carbon Monoxide (CO)	`UGFAIRIUMAVXCW-UHFFFAOYSA-N`	molecule
Water (H₂O)	`XLYOFNOQVPJJNP-UHFFFAOYSA-N`	molecule
Ammonia (NH₃)	`QGZKDVFQNNGYKY-UHFFFAOYSA-N`	molecule
Methane (CH₄)	`VNWKTOKETHGBQM-UHFFFAOYSA-N`	molecule

Species Resolution for Queries

Unlike node identifiers, species are always identified by InChIKey. However, the CLI provides intelligent features:

Multiple species in one query:

vamdc get lines \
  --inchikey=OKTJSMMVPCPJKN-UHFFFAOYSA-N \
  --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
  --inchikey=XLYOFNOQVPJJNP-UHFFFAOYSA-N \
  --node=cdms \
  --lambda-min=1000 --lambda-max=10000

Automatic species validation: The CLI checks if the InChIKey exists in the database

# Invalid InChIKey
vamdc get lines --inchikey=INVALID-INCHIKEY-XXX --node=cdms
# Error: No species with InChIKey 'INVALID-INCHIKEY-XXX' were found.

Automatic node filtering: The CLI identifies which of the specified nodes have data for each species

# The CLI internally checks which of your specified nodes have this species
vamdc get lines \
  --inchikey=OKTJSMMVPCPJKN-UHFFFAOYSA-N \
  --node=cdms --node=topbase --node=vald
# Queries all specified nodes that have this species

Query species with short node names:

# Query carbon from multiple nodes using short names
vamdc get lines \
  --inchikey=OKTJSMMVPCPJKN-UHFFFAOYSA-N \
  --node=topbase \
  --node=chianti \
  --node=vald \
  --lambda-min=1000 \
  --lambda-max=10000 \
  --format csv \
  --output carbon_lines.csv

Workflow: Find and Query Species

Step 1: Find the InChIKey

# Search for a species by name or formula
vamdc get species --filter-by "name:CO"

# Find column headers
vamdc get species --format csv | head -1

Step 2: Identify available nodes

# Export species info and check available nodes
vamdc get species --format csv --output species.csv

# View data for specific species
grep "UGFAIRIUMAVXCW-UHFFFAOYSA-N" species.csv

# Look up the details of a node listed for this species (e.g., CDMS)
vamdc get nodes --format csv | grep -i "cdms"

Step 3: Check available data

# Use count lines to see data availability
vamdc count lines \
  --inchikey=UGFAIRIUMAVXCW-UHFFFAOYSA-N \
  --node=cdms \
  --lambda-min=1000 \
  --lambda-max=10000

Step 4: Download the data

# Download spectral lines
vamdc get lines \
  --inchikey=UGFAIRIUMAVXCW-UHFFFAOYSA-N \
  --node=cdms \
  --lambda-min=1000 \
  --lambda-max=10000 \
  --format csv \
  --output co_lines.csv

Combining Multiple Species and Nodes

Query multiple species from multiple nodes:

# Compare CO and H2O across different databases
vamdc get lines \
  --inchikey=UGFAIRIUMAVXCW-UHFFFAOYSA-N \
  --inchikey=XLYOFNOQVPJJNP-UHFFFAOYSA-N \
  --node=cdms \
  --node=jpl \
  --node=basecol2015 \
  --lambda-min=100000 \
  --lambda-max=200000 \
  --format csv \
  --output molecules.csv

Output CSV will include:

All spectral line data
node column: which database the line came from
species_type column: atom or molecule
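
The node column makes it easy to see how much each database contributed to a combined export. Here is a minimal Python sketch; `lines_per_node` is a hypothetical helper, and the sample rows (including the `wavelength` column name) are made up for illustration.

```python
import csv
import io
from collections import Counter


def lines_per_node(csv_text):
    """Count how many rows each node contributed to a combined CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["node"] for row in reader)


# Stand-in for molecules.csv; "wavelength" is a placeholder column name.
SAMPLE = """wavelength,node,species_type
100500.1,cdms,molecule
100500.3,jpl,molecule
150200.7,cdms,molecule
"""

for node, count in sorted(lines_per_node(SAMPLE).items()):
    print(f"{node}: {count}")
```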

Error Handling for Species

Invalid InChIKey format:

vamdc get lines --inchikey=INVALID-KEY --node=cdms
# Error: No species with InChIKey 'INVALID-KEY' were found.

Solutions:

Check spelling with vamdc get species --filter-by "name:..."
List all available species: vamdc get species --format csv
Use the species database export to find exact InChIKey

Pro Tips

Save common InChIKeys: Create a reference file

cat > species_inchikeys.txt << EOF
# Molecules
CO=UGFAIRIUMAVXCW-UHFFFAOYSA-N
H2O=XLYOFNOQVPJJNP-UHFFFAOYSA-N
NH3=QGZKDVFQNNGYKY-UHFFFAOYSA-N
EOF

Query in a loop:

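# Skip comment lines and split each NAME=INCHIKEY entry
grep -v '^#' species_inchikeys.txt | while IFS='=' read -r name inchikey; do
  [ -n "$inchikey" ] || continue
  vamdc get lines \
    --inchikey="$inchikey" \
    --node=cdms \
    --lambda-min=1000 --lambda-max=10000 \
    --format csv \
    --output "lines_${name}.csv"
done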

Combine with node iteration:

# Query a species from all available nodes
for node in cdms jpl topbase basecol2015 vald chianti; do
  vamdc get lines \
    --inchikey=UGFAIRIUMAVXCW-UHFFFAOYSA-N \
    --node="$node" \
    --lambda-min=100000 --lambda-max=200000 \
    --format csv \
    --output "co_${node}.csv" 2>/dev/null || echo "No data from $node"
done

Performance Tips

Use count lines before downloading: Check data size first

vamdc count lines --inchikey=... --node=... --lambda-min=... --lambda-max=...

Use --accept-truncation for large queries: Skips automatic query splitting, at the cost of possibly truncated results
```
vamdc get lines ... --accept-truncation
```

Query multiple species/nodes in one command: Leverages parallel processing

vamdc get lines --inchikey=SPECIES1 --inchikey=SPECIES2 --inchikey=SPECIES3 ...
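
When the species and node lists come from a script, the single combined command can be assembled programmatically instead of looping over separate calls. The sketch below only builds and prints the argument list; `build_get_lines_command` is a hypothetical helper that uses the flags documented above.

```python
import shlex


def build_get_lines_command(inchikeys, nodes, lambda_min, lambda_max, output):
    """Build the argument list for one `vamdc get lines` call covering all inputs."""
    args = ["vamdc", "get", "lines"]
    args += [f"--inchikey={key}" for key in inchikeys]
    args += [f"--node={node}" for node in nodes]
    args += [f"--lambda-min={lambda_min}", f"--lambda-max={lambda_max}"]
    args += ["--format", "csv", "--output", output]
    return args


command = build_get_lines_command(
    inchikeys=["XLYOFNOQVPJJNP-UHFFFAOYSA-N", "OKTJSMMVPCPJKN-UHFFFAOYSA-N"],
    nodes=["cdms", "jpl"],
    lambda_min=100000,
    lambda_max=200000,
    output="combined.csv",
)
# Print the command for inspection; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Keeping the arguments as a list avoids shell-quoting problems when node identifiers are full URLs.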

Use cache: Metadata is cached for 24 hours

# First call: downloads metadata
vamdc get species

# Subsequent calls: uses cache (fast)
vamdc get species --filter-by "name:..."
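
A 24-hour cache lifetime of this kind is usually implemented by comparing a file's modification time with the current time. The sketch below shows that general technique; it is an assumption about the approach, not the CLI's actual code, and `is_cache_fresh` is a hypothetical name.

```python
import os
import time

CACHE_MAX_AGE_SECONDS = 24 * 60 * 60  # the 24-hour lifetime mentioned above


def is_cache_fresh(path, now=None, max_age=CACHE_MAX_AGE_SECONDS):
    """Return True when the file exists and was modified less than max_age seconds ago."""
    if not os.path.exists(path):
        return False
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) < max_age


print(is_cache_fresh("species_cache_file_that_does_not_exist"))  # False
```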

Narrow wavelength ranges: Reduces data volume and query time

# Instead of querying the full spectrum
--lambda-min=0 --lambda-max=1000000000

# Use targeted ranges
--lambda-min=1000 --lambda-max=2000
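
If the range of interest is known as a frequency window, it has to be converted to wavelengths first. The sketch below assumes that --lambda-min and --lambda-max are expressed in Angstrom (verify with `vamdc get lines --help`); the function names are hypothetical, and the `vamdc convert energy` command can be used for the same purpose.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
ANGSTROM_PER_METRE = 1e10


def ghz_to_angstrom(frequency_ghz):
    """Convert a frequency in GHz to a vacuum wavelength in Angstrom."""
    wavelength_m = SPEED_OF_LIGHT_M_PER_S / (frequency_ghz * 1e9)
    return wavelength_m * ANGSTROM_PER_METRE


def wavelength_window(low_ghz, high_ghz):
    """Return (lambda_min, lambda_max) in Angstrom for a frequency window."""
    # Higher frequency means shorter wavelength, so the bounds swap.
    return ghz_to_angstrom(high_ghz), ghz_to_angstrom(low_ghz)


lam_min, lam_max = wavelength_window(100.0, 120.0)
print(f"--lambda-min={lam_min:.0f} --lambda-max={lam_max:.0f}")
```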

Troubleshooting

"Node not found" error

Ensure you're using a valid node identifier. Check available nodes:

vamdc get nodes --format csv

Verify the node has a TAP endpoint (some nodes may not support queries):

vamdc get nodes --format csv | grep -v ",,"

"No species with InChIKey ... found"

Verify the InChIKey is correct:

vamdc get species --format csv --output species.csv
grep "YOUR_INCHIKEY" species.csv

"No matching data were found"

This can occur if:

The species is not available in the specified node
The wavelength range has no data
The node/species combination is invalid

Check what's available:

# Find which nodes have your species
vamdc get species --filter-by "InChIKey:YOUR_INCHIKEY"

# Try a broader wavelength range
--lambda-min=0 --lambda-max=1000000000

"Number of processes must be at least 1"

This occurs when no matching species/node combinations are found. Verify:

The InChIKey exists in the species database
The node identifier is correct
The node has data for that species
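
The wording of this message matches the error that Python's multiprocessing.Pool raises when asked for zero worker processes, which suggests the pool is sized from the number of matching species/node combinations. That is an inference, not something confirmed here. The sketch below shows how such combinations could be checked up front; `plan_queries` and the availability table are hypothetical.

```python
def plan_queries(inchikeys, nodes, availability):
    """Return the (inchikey, node) pairs for which the node lists the species."""
    combos = [
        (key, node)
        for key in inchikeys
        for node in nodes
        if node in availability.get(key, set())
    ]
    if not combos:
        # Failing here gives a clearer message than an empty worker pool would.
        raise ValueError("No requested node provides any of the requested species.")
    return combos


# Hypothetical availability table: InChIKey -> nodes that hold the species.
AVAILABILITY = {"OKTJSMMVPCPJKN-UHFFFAOYSA-N": {"topbase", "vald"}}

print(plan_queries(["OKTJSMMVPCPJKN-UHFFFAOYSA-N"], ["cdms", "topbase"], AVAILABILITY))
```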

Cache issues

Clear the cache if you experience unexpected behavior:

vamdc cache clear
vamdc get species --refresh  # Rebuild cache

Enable verbose output or debug mode

For troubleshooting, use the --verbose or --debug flags:

# Verbose mode - detailed context and logging
vamdc --verbose get lines --inchikey=... --node=... --lambda-min=... --lambda-max=...

# Debug mode - full stack traces and diagnostic information
vamdc --debug get lines --inchikey=... --node=... --lambda-min=... --lambda-max=...

# Quiet mode - minimal output (useful for checking if command succeeds)
vamdc --quiet get lines --inchikey=... --node=... --lambda-min=... --lambda-max=...

When debugging:

Start with --verbose to see what's happening
Use --debug if you need full tracebacks
Check the error messages for specific issues (node not found, invalid InChIKey, etc.)

Query takes too long

Use count lines to check data volume first
Add --accept-truncation to prevent automatic query splitting
Narrow your wavelength range
Query fewer species/nodes simultaneously

XSAMS files filling up disk

Check XSAMS cache size:

vamdc cache status

Clear XSAMS files:

vamdc cache clear

Or manually remove specific files:

rm ~/.cache/vamdc/xsams/*.xsams

Getting Help

View command help:

vamdc --help
vamdc get --help
vamdc get nodes --help
vamdc get species --help
vamdc get lines --help
vamdc count --help
vamdc count lines --help
vamdc cache --help

Advanced Examples

Query all data for a specific wavelength range

# Get all available species in UV range (no filters)
vamdc get lines \
  --lambda-min=1000 \
  --lambda-max=4000 \
  --format csv \
  --output uv_lines.csv \
  --accept-truncation

Pipeline with filtering

# Get species list, filter, then query
vamdc get species --format csv --output species.csv
awk -F',' '$5=="molecule" {print $6}' species.csv > molecule_inchikeys.txt

# Query first 3 molecules
head -3 molecule_inchikeys.txt | while read inchikey; do
  vamdc get lines \
    --inchikey="$inchikey" \
    --lambda-min=100000 \
    --lambda-max=200000 \
    --format csv \
    --output "lines_${inchikey}.csv" \
    --accept-truncation
done

Check metadata for multiple sources

# Compare data availability across nodes
for node in "https://cdms.astro.uni-koeln.de/cdms/tap/" \
            "https://cdms.astro.uni-koeln.de/jpl/tap/" \
            "http://basecoltap2015.vamdc.org/12_07/TAP/"; do
  echo "=== $node ==="
  vamdc count lines \
    --inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N \
    --node="$node" \
    --lambda-min=100000 \
    --lambda-max=200000 \
    2>/dev/null || echo "No data"
done

API Wrapper

The CLI uses high-level wrapper functions:

lines_module.getLines() - Downloads and converts data
lines_module.get_metadata_for_lines() - HEAD requests only
lines_module._build_and_run_wrappings() - Internal parallel processing

These provide better performance and flexibility compared to direct VamdcQuery instantiation.

Logging System Architecture

The CLI uses a configurable logging system that adapts output verbosity to the use case, from AI agents that need minimal context to developers who need full diagnostic information.

Core Components

1. LogLevel Enum (`spectral/logging_config.py`)

Defines five verbosity levels:

class LogLevel(Enum):
    SILENT = 0    # No output except results
    MINIMAL = 1   # One-line error summaries
    NORMAL = 2    # Standard error messages (default)
    VERBOSE = 3   # Detailed messages with context
    DEBUG = 4     # Full tracebacks

2. SmartLogger Class

Context-aware logger that adapts its output based on the global log level:

SILENT: No error output
MINIMAL: One line on stderr in the form Error: {message}: {ExceptionType}
NORMAL: Formatted logging with exception details
VERBOSE: Detailed context including module names
DEBUG: Complete stack traces
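
As a rough illustration of level-dependent output, the sketch below formats one error differently depending on the level. It is not the SmartLogger implementation: `format_error` is a hypothetical function, only the MINIMAL format follows the description above, and the rendering for the other levels is invented for the example.

```python
from enum import Enum


class LogLevel(Enum):
    SILENT = 0
    MINIMAL = 1
    NORMAL = 2
    VERBOSE = 3
    DEBUG = 4


def format_error(level, message, exc):
    """Return the text to write to stderr, or None when nothing should be shown."""
    if level is LogLevel.SILENT:
        return None
    if level is LogLevel.MINIMAL:
        return f"Error: {message}: {type(exc).__name__}"
    # Hypothetical rendering for the remaining levels: include the exception text.
    return f"ERROR - {message}: {type(exc).__name__}: {exc}"


print(format_error(LogLevel.MINIMAL, "Query failed", TimeoutError("node did not respond")))
```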

3. Global Configuration

The logging system can be configured via:

CLI flags: --quiet, --verbose, --debug (highest priority)
Environment variable: VAMDC_LOG_LEVEL (fallback)
Default: NORMAL mode
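
The precedence can be summarised in a short sketch. `resolve_log_level` is a hypothetical function; the mapping of --quiet to MINIMAL follows the testing examples in this document, and falling back to NORMAL for an unrecognised VAMDC_LOG_LEVEL value is an assumption.

```python
import os

LEVELS = ("SILENT", "MINIMAL", "NORMAL", "VERBOSE", "DEBUG")
FLAG_TO_LEVEL = {"--quiet": "MINIMAL", "--verbose": "VERBOSE", "--debug": "DEBUG"}


def resolve_log_level(cli_flag=None, environ=os.environ):
    """Pick the log level: CLI flag first, then VAMDC_LOG_LEVEL, then NORMAL."""
    if cli_flag in FLAG_TO_LEVEL:
        return FLAG_TO_LEVEL[cli_flag]
    from_env = environ.get("VAMDC_LOG_LEVEL", "").upper()
    if from_env in LEVELS:
        return from_env
    return "NORMAL"


print(resolve_log_level("--debug", {"VAMDC_LOG_LEVEL": "SILENT"}))
```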

Error Handling Pattern

All errors in the codebase follow this pattern:

from pyVAMDC.spectral.logging_config import get_logger

LOGGER = get_logger(__name__)

try:
    # Operation that might fail
    result = risky_operation()
except SpecificException as e:
    LOGGER.error(
        "Clear description of what failed",
        exception=e,
        show_traceback=False  # Set to True for unexpected errors
    )
    # Handle gracefully

Module Integration

Modified modules:

spectral/species.py: Replaced print() and bare except clauses with SmartLogger
spectral/vamdcQuery.py: Removed _display_message() function and verbose parameter; replaced with SmartLogger
spectral/lines.py: Removed verbose parameter from all functions; updated print() to logger.info()
spectral/cli.py: Integrated verbosity flags (--quiet, --verbose, --debug) and traceback control

Key API Changes:

✅ Removed verbose boolean parameter from VamdcQuery.__init__(), getLines(), get_metadata_for_lines(), and getLinesByTelescopeBand()
✅ Logging verbosity now controlled globally via CLI flags or VAMDC_LOG_LEVEL environment variable
✅ Debug messages (query creation, splitting, status) controlled by log level instead of function parameters

For Developers

When adding new functions that might generate errors:

from pyVAMDC.spectral.logging_config import get_logger

LOGGER = get_logger(__name__)

def your_function(value):
    try:
        # Your code
        pass
    except ValueError as e:
        # Known error type - don't show traceback
        LOGGER.error(
            f"Invalid value for parameter X: {value}",
            exception=e,
            show_traceback=False
        )
    except Exception as e:
        # Unexpected error - show traceback in DEBUG mode
        LOGGER.error(
            "Unexpected error in your_function",
            exception=e,
            show_traceback=True
        )

Benefits

✅ AI-Friendly: --quiet mode prevents context saturation for AI agents
✅ User-Friendly: Default mode balances information and clarity
✅ Developer-Friendly: --debug mode provides complete diagnostic information
✅ Consistent: All modules use the same logging system
✅ Flexible: Control via CLI, environment variables, or programmatically

Testing Logging Levels

Test different verbosity levels:

# Silent mode - no errors shown (variable set for this command only,
# so it does not affect the commands below)
VAMDC_LOG_LEVEL=SILENT vamdc get species

# Minimal mode - one-line errors
vamdc --quiet get species

# Normal mode - standard errors
vamdc get species

# Verbose mode - detailed logging
vamdc --verbose get species

# Debug mode - full tracebacks
vamdc --debug get species

Acknowledgments

This CLI interfaces with the VAMDC (Virtual Atomic and Molecular Data Centre) infrastructure, which aggregates spectroscopic data from multiple international databases.
