GCP Test Project Setup for BigQuery Integration Tests

Status: Reference / runbook Audience: sqlanvil contributors who need to run the BigQuery integration tests (//cli:index_run_e2e_test, //tests/integration:bigquery.spec) against a real BigQuery instance.

0. TL;DR

# One-time setup (~30 minutes):
export PROJECT_ID="sqlanvil-test-$(openssl rand -hex 3)"

gcloud projects create "$PROJECT_ID" --name="sqlanvil test"
gcloud config set project "$PROJECT_ID"
gcloud services enable bigquery.googleapis.com storage.googleapis.com

gcloud iam service-accounts create sqlanvil-test-runner \
  --display-name="sqlanvil integration test runner"

gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:sqlanvil-test-runner@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role=roles/bigquery.dataEditor

gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:sqlanvil-test-runner@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role=roles/bigquery.jobUser

mkdir -p test_credentials
gcloud iam service-accounts keys create test_credentials/bigquery.json \
  --iam-account="sqlanvil-test-runner@${PROJECT_ID}.iam.gserviceaccount.com"

# Per-shell (or .envrc). Optional — the project is auto-detected from
# test_credentials/bigquery.json if unset; the test location is hardcoded to "US".
export SA_TEST_BIGQUERY_PROJECT="$PROJECT_ID"

# Verify:
./scripts/docker-bazel test //tests/integration:bigquery.spec

1. Why a dedicated project

Blast radius. Bugs in test code could drop datasets. Keep that blast radius off any project that holds real data.
Billing isolation. Easier to detect runaway test costs when the project has zero other workload.
Service-account scoping. Least-privilege roles only — no admin access leaks if a key is ever exposed.
Cleanup. Tearing down a project deletes everything; no orphans.

2. Cost expectations

BigQuery free tier covers all test usage comfortably:

Resource	Free tier	Test usage
Storage	10 GB/month	<10 MB
Query scans	1 TB/month	<100 MB per test run
Streaming inserts	Not free	Tests don't use streaming

Expected monthly bill: $0 for normal contributor cadence (a few hundred test runs/month).

Set a billing budget alert at $5/mo as a safety net (see §5).

3. Setup walkthrough

3.1. Prereqs

gcloud CLI installed and authenticated (gcloud auth login)
Either a personal Google account OR a billing account that allows free-tier-only projects
Optional but recommended: direnv for per-repo env var loading

3.2. Create the project

Pick a unique project ID. GCP project IDs are globally unique and permanent — use a random suffix:

PROJECT_ID="sqlanvil-test-$(openssl rand -hex 3)"
echo "$PROJECT_ID"  # e.g. sqlanvil-test-a3f7c9

gcloud projects create "$PROJECT_ID" --name="sqlanvil test"
gcloud config set project "$PROJECT_ID"

3.3. Link billing (required even for free-tier-only usage)

GCP requires a billing account on any project that uses APIs, even if all usage stays in the free tier. List your billing accounts:

gcloud billing accounts list

Link one:

gcloud billing projects link "$PROJECT_ID" \
  --billing-account=BILLING_ACCOUNT_ID

3.4. Enable APIs

gcloud services enable \
  bigquery.googleapis.com \
  storage.googleapis.com

Storage is needed because some integration tests stage files in GCS.

3.5. Create the service account

gcloud iam service-accounts create sqlanvil-test-runner \
  --display-name="sqlanvil integration test runner"

3.6. Grant minimum roles

Do NOT use roles/owner or roles/editor. Stick to least privilege:

SA="sqlanvil-test-runner@${PROJECT_ID}.iam.gserviceaccount.com"

# Read/write BQ tables in this project
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA}" \
  --role=roles/bigquery.dataEditor

# Run BQ jobs (queries, loads) in this project
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA}" \
  --role=roles/bigquery.jobUser

If integration tests touch GCS (some notebook tests do):

# Read/write only — no bucket admin
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA}" \
  --role=roles/storage.objectUser

3.7. Download the service-account key

mkdir -p test_credentials
gcloud iam service-accounts keys create test_credentials/bigquery.json \
  --iam-account="$SA"

Verify:

test -f test_credentials/bigquery.json && echo "OK" || echo "MISSING"

test_credentials/*.json is in .gitignore — confirm:

git status --porcelain test_credentials/bigquery.json | head
# (no output = ignored, good)

3.8. Set local env vars

cli/index_test_base.ts reads SA_TEST_BIGQUERY_PROJECT — but only as an override. If it's unset, the project is taken from test_credentials/bigquery.json's projectId, so for most setups you don't need to export anything. The test location is hardcoded to US.

# Optional override (otherwise read from test_credentials/bigquery.json):
export SA_TEST_BIGQUERY_PROJECT="$PROJECT_ID"

To persist, add it to your shell profile or a direnv .envrc:

# .envrc (used by direnv)
export SA_TEST_BIGQUERY_PROJECT="sqlanvil-test-a3f7c9"

3.9. Verify

./scripts/docker-bazel test //tests/integration:bigquery.spec

Expected: PASSED. If it still fails, see §6.

4. CI integration (GitHub Actions)

4.1. Store the key as a secret

GitHub repo → Settings → Secrets and variables → Actions → New repository secret:

Secret name	Value
`SA_TEST_BIGQUERY_KEY`	Contents of `test_credentials/bigquery.json` (the full JSON, multi-line)
`SA_TEST_BIGQUERY_PROJECT`	The project ID (e.g. `sqlanvil-test-a3f7c9`) — optional if the key's `projectId` is correct

4.2. Workflow snippet

# .github/workflows/test.yml
name: test

on: [push, pull_request]

jobs:
  bq-integration:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Write BQ creds
        run: |
          mkdir -p test_credentials
          echo '${{ secrets.SA_TEST_BIGQUERY_KEY }}' > test_credentials/bigquery.json

      - name: Test
        env:
          SA_TEST_BIGQUERY_PROJECT: ${{ secrets.SA_TEST_BIGQUERY_PROJECT }}
        run: ./scripts/docker-bazel test //tests/integration:bigquery.spec

4.3. Forks

PRs from forks don't receive secrets by default. This is correct — you don't want an attacker submitting a PR that exfiltrates your BQ key. Either:

Skip BQ integration tests on fork PRs (mark them continue-on-error: true)
Run BQ tests only on pull_request_target events from trusted contributors
Run BQ tests post-merge against main only

Recommend: skip on fork PRs, run on merge to main.

5. Hardening

5.1. Billing budget alert

# Replace with your billing account ID
gcloud billing budgets create \
  --billing-account=BILLING_ACCOUNT_ID \
  --display-name="sqlanvil-test budget" \
  --budget-amount=5 \
  --threshold-rule=percent=50 \
  --threshold-rule=percent=100 \
  --filter-projects="projects/${PROJECT_ID}"

Get email alerts at $2.50 (50%) and $5 (100%).

5.2. Daily BQ quota cap

In the GCP console: IAM & Admin → Quotas → Filter "BigQuery API" → Query usage per day per project → set to 100 GiB.

Hard limit. Worst-case impact is hours-of-lost-testing rather than weeks-of-billing.

5.3. Key rotation

Service-account keys never expire by default. Rotate annually:

# List existing keys
gcloud iam service-accounts keys list \
  --iam-account="$SA"

# Create new key
gcloud iam service-accounts keys create test_credentials/bigquery.json \
  --iam-account="$SA"

# Delete old keys (record the KEY_ID from `list` above)
gcloud iam service-accounts keys delete OLD_KEY_ID \
  --iam-account="$SA"

Update the GitHub secret with the new key.

5.4. Test isolation

Tests should generate unique dataset names so parallel runs don't collide. Pattern (already used in some tests):

const DATASET = `sqlanvil_test_${process.env.BUILD_ID || Date.now()}`;

Tear down at end:

afterAll(async () => {
  await client.dataset(DATASET).delete({ force: true });
});

6. Troubleshooting

`Missing credentials JSON file; not found at path 'test_credentials/bigquery.json'`

The key file isn't where the tests look. Check:

ls -la test_credentials/bigquery.json

Bazel runs in a sandbox — the file needs to be declared as test data. That declaration is in cli/BUILD:

data = [
    "//test_credentials:bigquery.json",
    ...
]

If test_credentials/BUILD doesn't exist, create it:

cat > test_credentials/BUILD <<'EOF'
package(default_visibility = ["//visibility:public"])
exports_files(["bigquery.json"])
EOF

`Permission denied` on a BQ operation

Service account is missing a role. Check current bindings:

gcloud projects get-iam-policy "$PROJECT_ID" \
  --flatten="bindings[].members" \
  --filter="bindings.members:${SA}"

Expected: roles/bigquery.dataEditor, roles/bigquery.jobUser (and roles/storage.objectUser if storage tests touched).

`BigQuery: Not found: Dataset xxx`

Tests need to create datasets they reference. If a test asserts against a pre-existing dataset, you need to seed it once:

bq mk --dataset --location=US "$PROJECT_ID:sqlanvil_test"

Builds in CI succeed but local fails

Likely env var difference. Compare:

env | grep SA_TEST_BIGQUERY

vs the GitHub Actions env. Common cause: SA_TEST_BIGQUERY_PROJECT (or the projectId in test_credentials/bigquery.json) points at a different project locally than in CI.

7. Tear-down

When you no longer need the project:

gcloud projects delete "$PROJECT_ID"

This schedules the project for deletion (30-day grace period). All billing stops immediately, but you can restore within 30 days:

gcloud projects undelete "$PROJECT_ID"

After 30 days, the project ID becomes unrecoverable (and reusable by anyone — including squatters).

8. What this fixes

Failing test (before setup)	After setup
`//cli:index_run_e2e_test`	✅ passes
`//tests/integration:bigquery.spec`	✅ passes
`//tests/api:projects.spec`	❌ still fails — unrelated pre-existing bug (schema mismatch on `warehouse` property); see follow-up PR

9. References

cli/index_test_base.ts — where the constants are read
.gitignore — keeps test_credentials/*.json out of commits
docs/postgres_first_class_design.md §6 Phase 6 — Postgres integration test plan (parallel docker-driven fixture, no GCP)
BigQuery free tier
Service-account key best practices

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GCP Test Project Setup for BigQuery Integration Tests

0. TL;DR

1. Why a dedicated project

2. Cost expectations

3. Setup walkthrough

3.1. Prereqs

3.2. Create the project

3.3. Link billing (required even for free-tier-only usage)

3.4. Enable APIs

3.5. Create the service account

3.6. Grant minimum roles

3.7. Download the service-account key

3.8. Set local env vars

3.9. Verify

4. CI integration (GitHub Actions)

4.1. Store the key as a secret

4.2. Workflow snippet

4.3. Forks

5. Hardening

5.1. Billing budget alert

5.2. Daily BQ quota cap

5.3. Key rotation

5.4. Test isolation

6. Troubleshooting

`Missing credentials JSON file; not found at path 'test_credentials/bigquery.json'`

`Permission denied` on a BQ operation

`BigQuery: Not found: Dataset xxx`

Builds in CI succeed but local fails

7. Tear-down

8. What this fixes

9. References

FilesExpand file tree

gcp_test_project_setup.md

Latest commit

History

gcp_test_project_setup.md

File metadata and controls

GCP Test Project Setup for BigQuery Integration Tests

0. TL;DR

1. Why a dedicated project

2. Cost expectations

3. Setup walkthrough

3.1. Prereqs

3.2. Create the project

3.3. Link billing (required even for free-tier-only usage)

3.4. Enable APIs

3.5. Create the service account

3.6. Grant minimum roles

3.7. Download the service-account key

3.8. Set local env vars

3.9. Verify

4. CI integration (GitHub Actions)

4.1. Store the key as a secret

4.2. Workflow snippet

4.3. Forks

5. Hardening

5.1. Billing budget alert

5.2. Daily BQ quota cap

5.3. Key rotation

5.4. Test isolation

6. Troubleshooting

Missing credentials JSON file; not found at path 'test_credentials/bigquery.json'

Permission denied on a BQ operation

BigQuery: Not found: Dataset xxx

Builds in CI succeed but local fails

7. Tear-down

8. What this fixes

9. References

`Missing credentials JSON file; not found at path 'test_credentials/bigquery.json'`

`Permission denied` on a BQ operation

`BigQuery: Not found: Dataset xxx`