
analytics: enable GCS data access audit logs for integrated database #50

@bbest

Context

Reads of CalCOFI parquet/DuckDB files on Google Cloud Storage (via DuckDB httpfs or direct download) are currently untracked.

Implementation

Enable Data Access Audit Logs

  1. GCP Console → IAM & Admin → Audit Logs
  2. Find "Cloud Storage" in the service list
  3. Check DATA_READ (and optionally DATA_WRITE)
  4. Save
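The console steps above correspond to an `auditConfigs` entry in the project's IAM policy. As a minimal sketch, assuming the policy has been exported to JSON (for example with `gcloud projects get-iam-policy`), the helper below adds the Cloud Storage entry to a policy dict. The function name `enable_data_access_logs` is hypothetical and used only for illustration; the edited policy would still need to be applied, e.g. with `gcloud projects set-iam-policy`.

```python
import copy

# Service name used for Cloud Storage in IAM audit configs.
GCS_SERVICE = "storage.googleapis.com"


def enable_data_access_logs(policy, log_types=("DATA_READ",), service=GCS_SERVICE):
    """Return a copy of an IAM policy dict with the given audit log types enabled.

    Hypothetical helper: it only edits the dict and does not call any GCP API.
    """
    updated = copy.deepcopy(policy)
    configs = updated.setdefault("auditConfigs", [])

    # Reuse the existing entry for the service if there is one.
    entry = next((c for c in configs if c.get("service") == service), None)
    if entry is None:
        entry = {"service": service, "auditLogConfigs": []}
        configs.append(entry)

    log_configs = entry.setdefault("auditLogConfigs", [])
    enabled = {c.get("logType") for c in log_configs}
    for log_type in log_types:
        if log_type not in enabled:
            log_configs.append({"logType": log_type})
            enabled.add(log_type)
    return updated
```

Calling the helper twice with the same log types leaves the policy unchanged the second time, so it is safe to re-run against an already configured project.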

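This logs storage.objects.get calls with the bucket, object path, caller identity, timestamp, and IP address. Cloud Audit Logs does not track access to public objects (those readable by allUsers or allAuthenticatedUsers), so anonymous reads of a public bucket will not appear in these logs.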
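To show where those fields sit in an audit log entry, here is a minimal sketch that pulls them out of a log entry represented as a Python dict (for example, an entry downloaded from Logs Explorer as JSON). The `ObjectRead` type, the `parse_object_read` function, and every value in `SAMPLE_ENTRY` are made up for illustration; only the field paths follow the audit log entry structure.

```python
from typing import NamedTuple, Optional


class ObjectRead(NamedTuple):
    bucket: Optional[str]
    object_path: Optional[str]
    caller: Optional[str]
    timestamp: Optional[str]
    ip: Optional[str]


def parse_object_read(entry: dict) -> ObjectRead:
    """Extract bucket, object path, caller, timestamp, and IP from a log entry dict."""
    payload = entry.get("protoPayload", {})
    # resourceName has the form projects/_/buckets/<bucket>/objects/<object path>
    resource_name = payload.get("resourceName", "")
    bucket = entry.get("resource", {}).get("labels", {}).get("bucket_name")
    object_path = None
    if "/objects/" in resource_name:
        prefix, object_path = resource_name.split("/objects/", 1)
        # Fall back to the bucket in resourceName if the resource label is missing.
        if bucket is None and "/buckets/" in prefix:
            bucket = prefix.split("/buckets/", 1)[1]
    return ObjectRead(
        bucket=bucket,
        object_path=object_path,
        caller=payload.get("authenticationInfo", {}).get("principalEmail"),
        timestamp=entry.get("timestamp"),
        ip=payload.get("requestMetadata", {}).get("callerIp"),
    )


# Made-up example entry; all values are placeholders.
SAMPLE_ENTRY = {
    "timestamp": "2024-05-01T10:00:00Z",
    "resource": {"type": "gcs_bucket", "labels": {"bucket_name": "example-bucket"}},
    "protoPayload": {
        "methodName": "storage.objects.get",
        "resourceName": "projects/_/buckets/example-bucket/objects/example/a.parquet",
        "authenticationInfo": {"principalEmail": "analyst@example.org"},
        "requestMetadata": {"callerIp": "203.0.113.7"},
    },
}

if __name__ == "__main__":
    print(parse_object_read(SAMPLE_ENTRY))
```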

View Logs

Cloud Logging → Logs Explorer:

resource.type="gcs_bucket"
protoPayload.methodName="storage.objects.get"
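The same filter can be narrowed to one bucket or a time window. The sketch below composes such a filter string in Python; `build_read_filter` is a hypothetical helper and the bucket name in the usage line is a placeholder. Comparisons on separate lines are joined with an implicit AND, as in the two-line filter above.

```python
from datetime import datetime, timezone
from typing import Optional


def build_read_filter(bucket: Optional[str] = None,
                      since: Optional[datetime] = None) -> str:
    """Compose a Logs Explorer filter for object reads (hypothetical helper)."""
    clauses = [
        'resource.type="gcs_bucket"',
        'protoPayload.methodName="storage.objects.get"',
    ]
    if bucket:
        # Restrict to a single bucket via the monitored-resource label.
        clauses.append(f'resource.labels.bucket_name="{bucket}"')
    if since:
        # Pass a timezone-aware datetime; it is converted to UTC before formatting.
        stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        clauses.append(f'timestamp>="{stamp}"')
    return "\n".join(clauses)


if __name__ == "__main__":
    start = datetime(2024, 5, 1, tzinfo=timezone.utc)
    print(build_read_filter(bucket="example-bucket", since=start))
```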

Long-term Export to BigQuery

Cloud Logging → Log Router → Create Sink:

  • Destination: BigQuery dataset (e.g., calcofi_logs.gcs_access); Cloud Logging creates the log tables inside the dataset
  • Filter: resource.type="gcs_bucket"

Exporting to BigQuery makes it possible to run SQL queries over access patterns, e.g., daily unique users or the most-accessed datasets.
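As a rough illustration of the two example queries, the sketch below runs them against an in-memory SQLite table using Python's standard `sqlite3` module. The flat `gcs_reads` table, its column names, and the sample rows are all assumptions made for this example; the tables Cloud Logging writes to BigQuery use a nested schema, so column references and date functions would differ there.

```python
import sqlite3

# Made-up sample rows: (timestamp, caller identity, object path).
SAMPLE_ROWS = [
    ("2024-05-01T10:00:00Z", "a@example.org", "example/x.parquet"),
    ("2024-05-01T11:00:00Z", "a@example.org", "example/y.parquet"),
    ("2024-05-01T12:00:00Z", "b@example.org", "example/x.parquet"),
    ("2024-05-02T09:00:00Z", "b@example.org", "example/x.parquet"),
]

# Count distinct callers per calendar day (first 10 characters of the timestamp).
DAILY_UNIQUE_USERS = """
SELECT substr(ts, 1, 10) AS day, COUNT(DISTINCT principal) AS unique_users
FROM gcs_reads
GROUP BY day
ORDER BY day
"""

# Rank objects by number of reads, breaking ties by path.
MOST_ACCESSED = """
SELECT object_path, COUNT(*) AS reads
FROM gcs_reads
GROUP BY object_path
ORDER BY reads DESC, object_path
LIMIT ?
"""


def load_reads(rows):
    """Create an in-memory table with a hypothetical flat schema and load rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gcs_reads (ts TEXT, principal TEXT, object_path TEXT)")
    conn.executemany("INSERT INTO gcs_reads VALUES (?, ?, ?)", rows)
    return conn


def daily_unique_users(conn):
    return conn.execute(DAILY_UNIQUE_USERS).fetchall()


def most_accessed(conn, limit=10):
    return conn.execute(MOST_ACCESSED, (limit,)).fetchall()


if __name__ == "__main__":
    conn = load_reads(SAMPLE_ROWS)
    print(daily_unique_users(conn))
    print(most_accessed(conn, limit=2))
```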

Note

Audit logging incurs small GCP costs proportional to volume. Review pricing before enabling on high-traffic buckets.
