Document direct querying, helper functions, reproducibility #5

@bbest

Context

The docs Quarto book has a comprehensive maps.qmd and a good db.qmd, but api.qmd is a near-empty stub and there's no page on querying the GCS parquet data directly or with the new calcofi4r helpers.

Goal

Document how to query CalCOFI data three ways — direct SQL, calcofi4r helpers, and the int-app download — with a reproducibility story tying them together.

Tasks

  • New data-access.qmd: direct DuckDB + GCS parquet querying, covering httpfs setup, single-file vs hive-partitioned read_parquet examples, and a "## Reproducibility" section explaining the query/ folder in the int-app download
  • New helpers.qmd: the calcofi4r bio↔env matching helpers, with worked examples
  • Expand api.qmd: add a "superseded by calcofi4r helpers + direct querying" callout
  • _quarto.yml: insert data-access.qmd and helpers.qmd after db.qmd and before api.qmd
  • Recurring worked example across all three pages: "Pacific sardine larvae + temperature, Q1 2023, relaxed matching", shown three ways (direct SQL, cc_match_ichthyo_by_name(), int-app download), all producing identical rows
  • Verify: quarto render docs/; confirm the worked-example SQL matches attr(cc_match_ichthyo_by_name(...), "sql")
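
To make the single-file vs hive-partitioned distinction and the SQL-match check in the tasks above concrete, here is a minimal sketch in Python that only builds and compares SQL text. It is not the calcofi4r implementation. The bucket URL, release folder, and table name are hypothetical placeholders, and the helper names (single_file_sql, hive_partitioned_sql, normalize_sql, same_query) are invented for illustration. The only DuckDB features assumed are the httpfs extension and read_parquet with its hive_partitioning option.

```python
import re

# Hypothetical placeholder: the real CalCOFI bucket and folder layout
# are not stated in this issue.
BASE_URL = "https://storage.googleapis.com/example-bucket/example-release"

# Run once per DuckDB session before reading remote parquet files.
SETUP_SQL = "INSTALL httpfs; LOAD httpfs;"


def single_file_sql(table: str) -> str:
    """Build a SELECT over one parquet file at <BASE_URL>/<table>.parquet."""
    return f"SELECT * FROM read_parquet('{BASE_URL}/{table}.parquet')"


def hive_partitioned_sql(table: str) -> str:
    """Build a SELECT over a directory of key=value partition folders.

    With hive_partitioning enabled, DuckDB exposes the partition keys
    found in the folder names as ordinary columns.
    """
    return (
        f"SELECT * FROM read_parquet('{BASE_URL}/{table}/**/*.parquet', "
        "hive_partitioning = true)"
    )


def normalize_sql(sql: str) -> str:
    """Collapse whitespace runs and drop one trailing semicolon.

    This is a formatting-only normalization: it is case-sensitive and
    would also collapse whitespace inside string literals.
    """
    return re.sub(r"\s+", " ", sql).strip().rstrip(";").strip()


def same_query(a: str, b: str) -> bool:
    """Return True when two SQL strings differ only in layout."""
    return normalize_sql(a) == normalize_sql(b)
```

The strings returned by the two builders would be passed to a DuckDB connection after running SETUP_SQL. For the verification step, the SQL shown on the docs page and the text returned by attr(cc_match_ichthyo_by_name(...), "sql") could be compared in the same spirit as same_query, so that line breaks and indentation do not cause false mismatches.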

Blocked by: CalCOFI/workflows#51, CalCOFI/apps#40, CalCOFI/calcofi4r#10, CalCOFI/int-app#5. Final issue of a 5-issue epic.
