HTTP and Web

Tier: Intermediate
Commands covered: fetch, fetchpost

Per-command flag reference lives in /docs/help/. This page is the workflow layer — when to reach for each command and how they compose.

Two commands for hitting HTTP APIs from CSV data, one row at a time. Both have:

HTTP/2 with adaptive flow control for high throughput
RFC RateLimit-aware dynamic throttling that adapts to the server's rate-limit headers
jaq (a jq clone) integration for extracting fields from JSON responses
Four caching options: in-memory LRU (default, 2M entries), persistent disk cache, Redis cache, or no cache

fetch does HTTP GET. fetchpost does HTTP POST with HTML form encoding (default) or a MiniJinja-templated JSON body.

For the deep-dive, see docs/Fetch.md.

Quick decision table

If you want to…	Use	Notes
Enrich each row by GET-ing a URL	`fetch`	URL column or `--url-template`
POST each row to a service (form-encoded)	`fetchpost`	One column = one form field
POST each row as JSON via a MiniJinja template	`fetchpost --payload-tpl`	Template renders any content-type
Extract specific fields from a JSON response	`--jaq '...'`	jq syntax
Cache responses across runs	`--disk-cache` or `--redis-cache`	Avoid re-hitting an API for known inputs
Throttle to a known rate	`--rate-limit N`	Plus auto-throttle on RateLimit headers

`fetch`

HTTP GET per row. Two input styles:

A URL column — the value of one column is the URL to GET.
A URL template — --url-template 'https://api.example.com/users/{user_id}/orders' substitutes column values into a template.
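
To make the template style concrete, here is a minimal Python sketch of per-row placeholder substitution. It only illustrates the idea: `expand_url_template` is a hypothetical helper, not qsv code, the sample rows are made up, and it does not try to reproduce how qsv handles characters that need percent-encoding.

```python
import csv
import io
import re

def expand_url_template(template: str, row: dict) -> str:
    """Replace each {column} placeholder with that row's value (hypothetical helper)."""
    return re.sub(r"\{(\w+)\}", lambda m: row[m.group(1)], template)

# Illustrative data; the column name matches the placeholder in the template.
sample_csv = "user_id,name\n42,Ada\n7,Lin\n"
template = "https://api.example.com/users/{user_id}/orders"

urls = [
    expand_url_template(template, row)
    for row in csv.DictReader(io.StringIO(sample_csv))
]
for url in urls:
    print(url)
```

Each input row yields exactly one URL, which is why the template style needs no dedicated URL column in the CSV.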

By default fetch writes one minified JSON response per line (JSONL). With --new-column COL, it adds the response (or a --jaq-extracted value) as a new column to the original CSV.

Example: enrich a list of US ZIP codes with city/state (Zippopotamus API)

# data.csv has a "URL" column with values like
#   https://api.zippopotam.us/us/90210
qsv fetch URL data.csv \
  --new-column CityState \
  --jaq '[ .places[0]."place name", .places[0]."state abbreviation" ]' \
  > data_with_city_state.csv
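
For readers less familiar with jq syntax, the filter above can be read as the following Python sketch. The sample response is an illustrative, trimmed stand-in for what the API returns, and `city_state` is a hypothetical helper; how qsv serializes the resulting array into the new column is not shown here.

```python
import json

# Illustrative response body, trimmed to the fields the filter touches.
sample_response = json.loads("""
{
  "post code": "90210",
  "places": [
    {"place name": "Beverly Hills", "state abbreviation": "CA"}
  ]
}
""")

def city_state(doc: dict) -> list:
    """Python reading of: [ .places[0]."place name", .places[0]."state abbreviation" ]"""
    first = doc["places"][0]
    return [first["place name"], first["state abbreviation"]]

extracted = city_state(sample_response)
print(json.dumps(extracted))
```

The keys are quoted in the jq filter because they contain spaces; plain identifiers such as `.places` need no quotes.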

Example: URL template — fetch GitHub stargazers for any repo

# repos.csv has a "repo" column with values like "dathere/qsv"
qsv fetch --url-template 'https://api.github.com/repos/{repo}' \
  --new-column stars \
  --jaq '.stargazers_count' \
  --http-header "Authorization: Bearer $GITHUB_TOKEN" \
  repos.csv > repos_with_stars.csv

Example: fetch NOAA GHCN-Daily station data with a persistent disk cache

# stations.csv has a "url" column with NOAA station data URLs
qsv fetch url stations.csv \
  --disk-cache \
  --disk-cache-dir ~/.qsv-cache/noaa \
  --new-column raw_data > stations_with_data.csv

The disk cache means a second run reuses the previous downloads — useful for incremental ETL.

Example: rate-limit to 10 requests/sec with auto-back-off on RateLimit headers

qsv fetch URL data.csv --rate-limit 10 --new-column response > with_response.csv
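
One way to picture how a fixed rate limit and header-driven back-off can combine is the sketch below. It is an assumption-laden illustration, not qsv's algorithm: `next_delay` is a hypothetical policy, and the `RateLimit-Remaining` / `RateLimit-Reset` names follow the IETF draft RateLimit header fields, which a given server may or may not send.

```python
def next_delay(rate_limit_per_sec: float, headers: dict) -> float:
    """Seconds to wait before the next request (hypothetical policy)."""
    base = 1.0 / rate_limit_per_sec          # spacing implied by the fixed rate
    remaining = headers.get("RateLimit-Remaining")
    reset = headers.get("RateLimit-Reset")
    if remaining is None or reset is None:
        return base                          # no server hints: use the fixed rate
    remaining, reset = int(remaining), float(reset)
    if remaining <= 0:
        return max(base, reset)              # quota exhausted: wait for the window to reset
    # Spread the remaining quota evenly over the rest of the window.
    return max(base, reset / remaining)

print(next_delay(10, {}))
print(next_delay(10, {"RateLimit-Remaining": "5", "RateLimit-Reset": "10"}))
print(next_delay(10, {"RateLimit-Remaining": "0", "RateLimit-Reset": "30"}))
```

The point of the sketch is that the configured rate acts as a ceiling on request speed, while server hints can only slow requests further.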

Example: use Redis as the cache (shared across machines / CI runs)

qsv fetch URL data.csv \
  --redis-cache \
  --redis-cache-conn redis://my-redis-host:6379 \
  --new-column response > out.csv

Example: report mode — write a per-row HTTP status / timing TSV instead of merging into output

qsv fetch URL data.csv --report > report.tsv
# Columns: row,url,status,response_time_ms,...

See also: /docs/help/fetch.md, docs/Fetch.md, jaq, Recipe: Fetch & Cache.

`fetchpost`

HTTP POST per row. Two body styles:

HTML form encoding (default, Content-Type: application/x-www-form-urlencoded): each column listed in <column-list> becomes one form field.
MiniJinja-templated payload (--payload-tpl <file>): render any content type. Default is application/json; override with --content-type.

Example: POST each row to an ML inference endpoint as JSON

{# payload.j2 #}
{
  "model": "classifier-v3",
  "features": {
    "text": {{ text | tojson }},
    "country": {{ country | tojson }}
  }
}

qsv fetchpost https://ml.example.com/infer \
  --payload-tpl payload.j2 \
  --new-column prediction \
  --jaq '.label' \
  feedback.csv > feedback_classified.csv
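
The `tojson` filter in the template matters because column values can contain quotes or other characters that would break a hand-assembled JSON body. The Python sketch below mimics the rendered result for one row, using `json.dumps` as a stand-in for `tojson`; `render_payload` and the sample row are hypothetical, and this is not how qsv or MiniJinja renders the template internally.

```python
import json

def render_payload(row: dict) -> str:
    """Build the same JSON shape as payload.j2 for one row (stand-in for template rendering)."""
    return (
        '{\n'
        '  "model": "classifier-v3",\n'
        '  "features": {\n'
        f'    "text": {json.dumps(row["text"])},\n'
        f'    "country": {json.dumps(row["country"])}\n'
        '  }\n'
        '}'
    )

# Illustrative row whose text contains double quotes.
row = {"text": 'She said "great", twice', "country": "PH"}
body = render_payload(row)
print(body)

# The body parses cleanly because the embedded quotes were escaped.
parsed = json.loads(body)
```

Without the escaping step, the embedded double quotes in `text` would terminate the JSON string early and the service would most likely reject the request.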

Example: POST form-encoded data (simpler — no template file)

qsv fetchpost https://api.example.com/submit name,email,score \
  --new-column response_id \
  --jaq '.id' \
  responses.csv > responses_logged.csv

The columns name, email, score are sent as name=...&email=...&score=....
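
As a rough illustration of what that form body looks like on the wire, the Python sketch below encodes one made-up row with the standard library. The exact byte-level encoding qsv emits may differ in details (for example `+` versus `%20` for spaces); the row and the extra `notes` column are assumptions for the example.

```python
from urllib.parse import urlencode

# Illustrative row; only the listed columns become form fields.
row = {"name": "Ada Lovelace", "email": "ada@example.com", "score": "9.5", "notes": "not sent"}
columns = ["name", "email", "score"]

body = urlencode({col: row[col] for col in columns})
print(body)  # name=Ada+Lovelace&email=ada%40example.com&score=9.5
```

Columns that are not listed, such as `notes` here, are left out of the request body.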

Example: bulk OCR on a CSV of image URLs (rate-limited + cached)

qsv fetchpost https://ocr.example.com/extract image_url \
  --rate-limit 5 \
  --disk-cache \
  --new-column extracted_text \
  --jaq '.text' \
  images.csv > images_with_text.csv

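Example: custom content type (e.g., XML)

# --jaq applies a jq-style filter, so this assumes the service replies with JSON.
# Keys containing "-" or ":" must be quoted in the filter.
qsv fetchpost https://soap.legacy.example.com/api \
  --payload-tpl xml_envelope.j2 \
  --content-type 'application/xml' \
  --new-column result \
  --jaq '."SOAP-ENV:Envelope".Body.Response.Status' \
  records.csv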

See also: /docs/help/fetchpost.md, docs/Fetch.md, MiniJinja, template.

Caching strategy

Both fetch and fetchpost share the same caching options. Pick based on your access pattern:

Cache	Best for	Notes
In-memory LRU (default)	One-shot runs, small datasets	Lost when process exits; 2M entries by default; tune with `--mem-cache-size`
Disk (`--disk-cache`)	Repeated runs against stable APIs	Stored at `~/.qsv-cache/fetch` by default; configurable TTL
Redis (`--redis-cache`)	Distributed teams, CI/CD	Shared cache across machines; needs a Redis server
No cache (`--no-cache`)	Live data that must not be cached	Pricing endpoints, real-time stock data, etc.

For details on cache invalidation, the disk-cache TTL, and Redis connection strings, see docs/Fetch.md.
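
To show why an LRU cache saves requests when the same URL appears on several rows, here is a small Python sketch built on `functools.lru_cache`. It is a conceptual stand-in, not qsv's cache: the capacity is shrunk to 2 so that eviction is visible, and `fetch` only simulates a network call against made-up URLs.

```python
from functools import lru_cache

calls = []  # records every simulated network request

@lru_cache(maxsize=2)  # tiny capacity so eviction is visible
def fetch(url: str) -> str:
    calls.append(url)
    return f"response for {url}"

a = "https://api.example.com/a"
b = "https://api.example.com/b"
c = "https://api.example.com/c"

for url in [a, b, a, c, a, b]:
    fetch(url)

print(len(calls), "requests for 6 rows")
print(fetch.cache_info())
```

Six rows trigger four simulated requests: the two repeats of `a` are served from the cache, while `b` is evicted when `c` arrives and has to be fetched again. A persistent disk or Redis cache applies the same idea across runs instead of within a single process.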
