luiul/dbts

dbts

PyPI Python CI License: MIT

dbt environment runner with Snowflake zero-copy clone sandboxes.

Run dbt against a private, per-developer Snowflake clone of staging or live without managing a separate profile or CLI — and use the same tool as a single front end for your shared dev, staging, and live targets.

Install

uv tool install dbts
# or run ad hoc:
uvx dbts ...

Quick start

  1. Add a sandbox: target to ~/.dbt/profiles.yml (alongside your existing dev, staging, live). The database name must match the pattern <PREFIX>_SANDBOX_<USER>:

    <your_profile_name>:           # whatever your dbt_project.yml's profile: field says
      outputs:
        sandbox:
          type: snowflake
          account: <same as the other targets>
          user: <same as the other targets>
          role: <same as the other targets>
          authenticator: externalbrowser
          database: <prefix>_sandbox_<your_username>
          warehouse: <same as the other targets>
          schema: <same as the other targets>
  2. Create your clone:

    dbts up --from staging
  3. Run dbt against it:

    dbts build my_model
    dbts test +my_model+
  4. Refresh or drop when done:

    dbts refresh --from staging
    dbts drop

Commands

Sandbox — manage the per-developer zero-copy clone:

dbts up        --from staging|live         create the clone
dbts refresh   --from staging|live         drop and re-create the clone
dbts status                                show clone DB, source, age, owner
dbts drop                                  drop the clone

Inspect — preview / audit before (or after) a build:

dbts plan      [selectors...]              preview the build set
  --cost                                     estimate credits + runtime from QUERY_HISTORY
  --days N                                   QUERY_HISTORY lookback window (default 7, max 365)
dbts freshness [selectors...]              audit lineage; flag stale tables, suggest a rebuild
  --since                                    explicit threshold (24h, 7d, 1w, or ISO datetime)

dbt pass-through — forwarded to dbt with --target sandbox by default:

dbts run|build|test|compile|seed|snapshot|ls|show
dbts debug|deps|source|docs|parse|clean
  --target sandbox|staging|live|dev          choose target (-t)

Meta:

dbts version                               print installed version

Global flags: -v / --verbose (debug logging, including DDL), -q / --quiet (warnings only). -h is a shorthand for --help on every command. Run dbts <command> --help for the full option list.

Selectors

Bare positional model selectors work the same as in dbt:

dbts build my_model+              # forwarded as `--select my_model+`
dbts test +my_model+              # ancestors and descendants
dbts run a b c+                   # multi-selector union
dbts build my_model+ --exclude experiments
dbts build --select a b           # `--select` + bare positional are merged

Bare positional args on run / build / test / compile / seed / snapshot / ls / show are promoted to --select before being forwarded. Other subcommands (debug, deps, docs, parse, clean, source) pass arguments through verbatim.
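The promotion rule above can be sketched in Python. This is an illustrative reconstruction, not dbts's actual code: the function name, the flag handling, and the subcommand set are assumptions based only on the behaviour documented here.

```python
# Hypothetical sketch: bare positionals on select-aware subcommands are
# promoted to --select before forwarding; other subcommands pass through.
SELECT_AWARE = {"run", "build", "test", "compile", "seed", "snapshot", "ls", "show"}

def promote_selectors(subcommand: str, args: list[str]) -> list[str]:
    """Merge bare positional selectors into a single --select flag."""
    if subcommand not in SELECT_AWARE:
        return args  # debug, deps, docs, parse, clean, source: verbatim
    selectors, passthrough = [], []
    it = iter(args)
    for arg in it:
        if arg in ("--select", "-s"):
            selectors.append(next(it, ""))   # explicit --select values are merged
        elif arg.startswith("-"):
            passthrough.append(arg)
            # flags that take a value keep it attached (simplified list)
            if arg in ("--exclude", "-t", "--target"):
                passthrough.append(next(it, ""))
        else:
            selectors.append(arg)            # bare positional -> selector
    out = list(passthrough)
    if selectors:
        out += ["--select", *selectors]
    return out
```

For example, `promote_selectors("build", ["my_model+", "--exclude", "experiments"])` yields `["--exclude", "experiments", "--select", "my_model+"]`.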

Previewing a build

dbts plan lists exactly which models a dbts build (or run/test/...) with the same selectors would touch. Useful when a build against --target live blows up halfway and you need to add --exclude rules.

dbts plan my_model+ another_model+ --target live           # offline, fast
dbts plan --select tag:slow --exclude path:models/intermediate
dbts plan my_model+ --target live --cost                   # + Snowflake cost breakdown

By default the command is offline — it parses the dbt project but never connects to Snowflake. Output groups models by directory and shows materialization, tags, and parent count. The footer prints suggested --exclude path:<dir> and --exclude tag:<name> snippets sized by how many models each would prune.

Pass --cost to also estimate Snowflake credits and runtime from SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY. Each row gets median run and last seen columns; the footer adds total credits + USD for an incremental vs full refresh, plus a top-5 most expensive list. Default lookback is 7 days (--days N, max 365).

Cost estimates require a structured query_tag with a model field (HelloFresh's set_query_tag macro is one example) and read access to SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY. USD uses $3.00/credit by default; override with $DBTS_CREDIT_RATE. If access is missing, dbts plan falls back to the offline output.

Auditing freshness after an incident

dbts freshness answers "did data flow correctly through this lineage?" by reading INFORMATION_SCHEMA.TABLES.LAST_ALTERED for every table in the selected build set, sorting topologically (parents before children), and highlighting stale links in red.

# Did the catch-up run reach everything downstream of these models? (defaults to --target live)
dbts freshness base__recipe__cps+ base__recipe__ingredient__cps+

# Audit the full lineage in both directions
dbts freshness +base__recipe__cps+

# Compare against an explicit incident timestamp
dbts freshness base__recipe__cps+ --since '2026-05-09 17:00'

# Audit your sandbox clone instead
dbts freshness base__recipe__cps+ --target sandbox

Default target: live (unlike build / plan, which default to sandbox) — the typical post-incident question is "did production catch up?". Pass --target sandbox|staging|dev to audit elsewhere. The command resolves the target to the physical database name (<DB>_SANDBOX_<USER>, <DB>_STAGING, <DB>_DEV, or unsuffixed for live), mirroring the project's generate_database_name macro.
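The suffix scheme described above can be sketched as a small lookup. This mirrors only what is documented here (`_SANDBOX_<USER>`, `_STAGING`, `_DEV`, unsuffixed live); the function name is hypothetical and the project's actual generate_database_name macro may differ.

```python
def resolve_database(base_db: str, target: str, user: str) -> str:
    """Map a dbts target to the physical Snowflake database name (sketch)."""
    suffixes = {
        "sandbox": f"_SANDBOX_{user.upper()}",
        "staging": "_STAGING",
        "dev": "_DEV",
        "live": "",  # live is unsuffixed
    }
    if target not in suffixes:
        raise ValueError(f"unknown target: {target!r}")
    return f"{base_db.upper()}{suffixes[target]}"
```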

How "stale" is decided

Two values drive the colored output:

  • Baseline — the most recent LAST_ALTERED across all tables in the build set. It's the "freshest thing you have right now," shown in the header for context.
  • Threshold — anything older than this is flagged as stale (red row).

Without --since, the threshold is adaptive: baseline − 6 hours. The threshold drifts with whatever's currently fresh, so you don't have to know what "fresh" means today.

dbts freshness recipe+
# baseline = 14:02 (freshest table)
# threshold = 08:02   (anything older is stale)
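The adaptive rule can be sketched directly from the definitions above: baseline is the freshest LAST_ALTERED, threshold is baseline minus six hours, and anything older is flagged. An illustrative sketch only.

```python
from datetime import datetime, timedelta

ADAPTIVE_WINDOW = timedelta(hours=6)  # "baseline − 6 hours"

def classify(last_altered: dict[str, datetime]):
    """Return (baseline, threshold, stale) for a build set's LAST_ALTERED map."""
    baseline = max(last_altered.values())   # freshest table in the build set
    threshold = baseline - ADAPTIVE_WINDOW
    stale = {t for t, ts in last_altered.items() if ts < threshold}
    return baseline, threshold, stale
```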

With --since, the threshold is explicit and the adaptive window is ignored. Useful when you know exactly when something broke:

dbts freshness recipe+ --since '2026-05-09 17:00'   # absolute (ISO datetime or date)
dbts freshness recipe+ --since 24h                  # relative (s, m, h, d, w)
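Parsing the two `--since` forms (relative `24h` / `7d` / `1w` with units s, m, h, d, w, or an ISO date/datetime) might look like this. A sketch of the documented syntax; dbts's actual parser may differ.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}

def parse_since(value: str, now: datetime) -> datetime:
    """Turn a --since value into an absolute threshold timestamp."""
    m = re.fullmatch(r"(\d+)([smhdw])", value)
    if m:  # relative form, e.g. 24h
        return now - timedelta(**{_UNITS[m.group(2)]: int(m.group(1))})
    return datetime.fromisoformat(value)  # e.g. '2026-05-09 17:00'
```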

The footer prints a copy-pasteable dbts build --select X+ line covering the minimum set of stale roots needed to catch the chain back up — models that are already fresh upstream are not rebuilt.

What LAST_ALTERED actually means

The signal is Snowflake's INFORMATION_SCHEMA.TABLES.LAST_ALTERED, which dbt bumps on every INSERT, MERGE, UPDATE, or CREATE OR REPLACE. "Fresh" means "dbt touched it recently," not "new rows arrived" — exactly the right semantic for the "did the chain re-execute?" question. Requires read access to INFORMATION_SCHEMA.TABLES.

Project-side coupling

dbts assumes the dbt project's generate_database_name macro recognises ENV=sandbox and routes models into a _SANDBOX_<USER> suffixed database. See the dbt project's README for the macro snippet.

Profile resolution

dbts resolves the dbt profile name in this order:

  1. $DBTS_PROFILE if set.
  2. The profile: field in dbt_project.yml at the project root.

Jinja {{ env_var('NAME', 'default') }} calls in the profile: field are rendered against the current environment, so projects whose profile name is templated (e.g. tardis_{{ env_var('warehouse', 'snowflake') }}) work out of the box.
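The resolution order above can be sketched as follows. This uses a regex substitution rather than a real Jinja renderer, and the function name is hypothetical — it only illustrates the documented precedence ($DBTS_PROFILE first, then the rendered profile: value).

```python
import os
import re

def resolve_profile(project_yml_profile: str) -> str:
    """Resolve the dbt profile name: $DBTS_PROFILE wins, else render
    {{ env_var('NAME', 'default') }} calls in the profile: value."""
    override = os.environ.get("DBTS_PROFILE")
    if override:
        return override

    def sub(m: re.Match) -> str:
        name, default = m.group(1), m.group(2)
        value = os.environ.get(name, default)
        if value is None:  # no env var and no default supplied
            raise KeyError(f"env var {name} required by profile name")
        return value

    pattern = r"\{\{\s*env_var\(\s*'([^']+)'\s*(?:,\s*'([^']*)'\s*)?\)\s*\}\}"
    return re.sub(pattern, sub, project_yml_profile)
```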

Development

uv sync --group dev
uv run pytest
prek install   # one-time, runs ruff + ty on every commit

prek (Astral's Rust port of pre-commit) runs the hooks in .pre-commit-config.yaml. Install via brew install prek or uv tool install prek; the standard pre-commit binary also works.

Cutting a release (after moving [Unreleased] entries under a [X.Y.Z] heading in CHANGELOG.md):

./scripts/release.sh 0.6.0

The script bumps pyproject.toml, syncs the lockfile, runs checks, commits, tags, pushes, and creates the GitHub release from CHANGELOG.md. See scripts/README.md for the full workflow and recovery tips.
