From 3f2c071f53ca1fe424af3c2babff09c90b437887 Mon Sep 17 00:00:00 2001
From: James Broadhead
Date: Tue, 28 Apr 2026 15:17:16 +0000
Subject: [PATCH 1/5] lakebase: document SP grant for AppKit/CRUD apps and `databricks psql`
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds the SQL block (with `databricks_create_role` + DML grants) needed once after the first deploy of an AppKit Lakebase app — without it the SP gets `password authentication failed`. Promotes `databricks psql` as the runnable form of #56's manual `generate-database-credential` recipe.

Also flags that the `databricks postgres create-role` CLI rejects every SP-role payload, so agents stop trying to use it.

Co-authored-by: Isaac
---
 .../references/appkit/lakebase.md   |  3 ++
 skills/databricks-lakebase/SKILL.md | 47 ++++++++++++++-----
 2 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/skills/databricks-apps/references/appkit/lakebase.md b/skills/databricks-apps/references/appkit/lakebase.md
index c7ad167..acf5247 100644
--- a/skills/databricks-apps/references/appkit/lakebase.md
+++ b/skills/databricks-apps/references/appkit/lakebase.md
@@ -246,6 +246,8 @@ Check the response for the `active_deployment` field. If it exists with `status.
 If you skip this step, the Service Principal won't own the database schema. You'll create schemas under your credentials that the SP **cannot access** after deployment. See **`databricks-lakebase`** skill's **Schema Permissions for Deployed Apps** for the full workflow and recovery steps.

+> **First deploy with `lakebase`:** the SP also needs a Postgres role created via `databricks_create_role()` before it can connect at all — otherwise the deployed app fails with `password authentication failed for user ''`. Run the **Grant app SP for AppKit / CRUD apps** SQL block from the **`databricks-lakebase`** skill once after the first deploy, then restart the app.
+
 The Lakebase env vars (`PGHOST`, `PGDATABASE`, etc.)
are auto-set only when deployed. For local development, get the connection details from your endpoint and set them manually:

 ```bash
@@ -276,5 +278,6 @@ Load `server/.env` in your dev server (e.g. via `dotenv` or `node --env-file=ser
 | `permission denied for schema ` | Schema was created by another role (e.g. you ran locally before deploying) | **Ask the user before dropping** — `DROP SCHEMA` deletes all data. See **`databricks-lakebase`** skill's **Schema Permissions for Deployed Apps** for options |
 | Works locally but `permission denied` after deploy | Local credentials created the schema; the SP can't access schemas it doesn't own | **Ask the user before dropping** — warn about data loss, then deploy first. See **`databricks-lakebase`** skill's **Schema Permissions for Deployed Apps** for options |
 | `connection refused` | Pool not connected or wrong env vars | Check `PGHOST`, `PGPORT`, `LAKEBASE_ENDPOINT` are set |
+| `password authentication failed for user ''` | SP has no Postgres role on the branch (first deploy with `lakebase` plugin) | Run the **Grant app SP for AppKit / CRUD apps** SQL block from the **`databricks-lakebase`** skill, then restart the app |
 | `relation "X" does not exist` | Tables not initialized | Run `CREATE TABLE IF NOT EXISTS` at startup |
 | App builds but pool fails at runtime | Env vars not set locally | Set vars in `server/.env` — see Local Development above |
diff --git a/skills/databricks-lakebase/SKILL.md b/skills/databricks-lakebase/SKILL.md
index fba9381..c96a679 100644
--- a/skills/databricks-lakebase/SKILL.md
+++ b/skills/databricks-lakebase/SKILL.md
@@ -211,22 +211,19 @@ databricks postgres create-endpoint projects//branches/ <
 ```

 **Run SQL against Lakebase** (GRANT, CREATE INDEX, etc.):
-```bash
-# 1. Get endpoint host
-databricks postgres get-endpoint projects//branches//endpoints/ --profile
-# 2.
Generate OAuth token
-databricks postgres generate-database-credential \
-  projects//branches//endpoints/ \
-  --profile
+Preferred — `databricks psql` wrapper handles auth, host discovery, and TLS in one call:
+```bash
+databricks psql --project --branch --endpoint \
+  -- -d databricks_postgres -f path/to/script.sql --profile

-# 3. Connect (use token from step 2 as password, host from step 1)
-PGPASSWORD='' psql "host= user= dbname=databricks_postgres sslmode=require"
+# One-off statement
+databricks psql --project -- -d databricks_postgres -c "SELECT 1" --profile
 ```

-> **Note:** `generate-database-credential` requires the **endpoint** resource path (`.../endpoints/`), not a database or branch path.
+Requires `psql` on `PATH` (the wrapper shells out to it). Branch/endpoint default to the only one when there is just one.

-**Scriptable version** (single copy-paste, useful for agents):
+Manual form (use when the wrapper isn't available):
 ```bash
 EP=projects//branches//endpoints/
 # get-endpoint JSON shape: {"status": {"hosts": {"host": ""}, ...}, ...}
@@ -237,13 +234,35 @@ TOKEN=$(databricks postgres generate-database-credential $EP --profile
 PGPASSWORD="$TOKEN" psql "host=$HOST user= dbname=databricks_postgres sslmode=require"
 ```

-**Grant app SP access to synced tables** (run as project owner after sync is ONLINE and app is deployed):
+> **Note:** `generate-database-credential` requires the **endpoint** resource path (`.../endpoints/`), not a database or branch path.
+
+**Grant app SP access to synced tables** (read-only) — run as project owner after sync is ONLINE and app is deployed:
 ```sql
 GRANT USAGE ON SCHEMA public TO "";
 GRANT SELECT ON ALL TABLES IN SCHEMA public TO "";
 ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO "";
 ```
-For least-privilege, consider syncing into a dedicated schema instead of `public` so the grant is scoped to synced data only.
+
+**Grant app SP for AppKit / CRUD apps** (full DML) — run **once after the first deploy** of any app whose `lakebase` plugin owns its own schema. Without this the app fails to connect with `password authentication failed for user ''` because the SP has no Postgres role yet:
+```sql
+CREATE EXTENSION IF NOT EXISTS databricks_auth;
+
+DO $$
+DECLARE
+  sp TEXT := ''; -- from `databricks apps get -o json | jq -r .service_principal_client_id`
+BEGIN
+  PERFORM databricks_create_role(sp, 'SERVICE_PRINCIPAL');
+  EXECUTE format('GRANT CONNECT ON DATABASE "databricks_postgres" TO %I', sp);
+  EXECUTE format('GRANT ALL ON SCHEMA public TO %I', sp);
+  EXECUTE format('GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO %I', sp);
+  EXECUTE format('GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO %I', sp);
+  EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO %I', sp);
+  EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO %I', sp);
+END $$;
+```
+Pipe through `databricks psql` (above) — no UI step required. The grant is idempotent; re-running is safe.
+
+> **Footgun:** `databricks postgres create-role` (the CLI) cannot create the SP role — every payload shape returns `Field 'role' is required and must contain at least one subfield with a non-default value`. Use the `databricks_create_role()` SQL function above instead.

 Get SP client ID: `databricks apps get --profile ` → `service_principal_client_id` field.
@@ -257,6 +276,8 @@ Get SP client ID: `databricks apps get --profile ` → `serv
 | `cannot configure default credentials` | Use `--profile` flag or authenticate first |
 | `PERMISSION_DENIED` | Check workspace permissions |
 | `permission denied for schema` | Schema owned by another role. Deploy app first so SP creates/owns it |
+| `password authentication failed for user ''` (deployed app) | SP has no Postgres role on the branch yet.
Run the **Grant app SP for AppKit / CRUD apps** SQL block above, then restart the app |
+| `Field 'role' is required` from `databricks postgres create-role` | The CLI cannot create SP roles. Use the `databricks_create_role()` SQL function over `databricks psql` instead — see **Grant app SP for AppKit / CRUD apps** |
 | Protected branch won't delete | `update-branch` to set `spec.is_protected` to `false` first |
 | Long-running operation timeout | Use `--no-wait` and poll with `get-operation` |
 | Token expired during long query | Tokens expire after 1 hour; implement refresh (see [connectivity.md](references/connectivity.md)) |

From 97a1949792a551d0b16c148a3ca9481f3650cdd3 Mon Sep 17 00:00:00 2001
From: James Broadhead
Date: Tue, 28 Apr 2026 15:30:50 +0000
Subject: [PATCH 2/5] =?UTF-8?q?lakebase:=20correct=20CLI=20footgun=20frami?=
 =?UTF-8?q?ng=20=E2=80=94=20show=20working=20create-role=20form?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The original commit told agents the `databricks postgres create-role` CLI couldn't create SP roles. That was wrong — the CLI works with `--json '{"spec": {...}}'` (fields go on the inner Role object, not wrapped under `{"role": ...}`). The "Field 'role' is required" error fires when the inner Role has no recognized fields, which happens when the body is wrapped — the CLI strips `role` as unknown and ships an empty body.

Show the working CLI form alongside the SQL block (`databricks_create_role()` is still preferable in practice because it bundles role creation + grants into one psql round-trip), and rewrite the troubleshooting row to point at the wrapping confusion instead of saying the CLI is broken.

Also calls out that the CLI doesn't yet expose convenience flags for nested spec fields (TODOs in cmd/workspace/postgres/postgres.go) — that is the real gap, and a separate CLI PR is appropriate.
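
For illustration only, the body-shape rule described in this message can be sketched in Python. This is a minimal sketch and not part of the patch: `create_role_body`, `is_wrapped`, and the placeholder role name are hypothetical, and the field names are copied from the `--json` example in this series rather than verified against the CLI source.

```python
import json


def create_role_body(postgres_role: str) -> str:
    """Build a --json body with fields directly under `spec`.

    Hypothetical helper; the field names mirror the example in this series.
    """
    body = {
        "spec": {
            "identity_type": "SERVICE_PRINCIPAL",
            "postgres_role": postgres_role,
            "auth_method": "LAKEBASE_OAUTH_V1",
        }
    }
    return json.dumps(body, separators=(",", ":"))


def is_wrapped(body: str) -> bool:
    """Return True for the mistaken {"role": {...}} wrapping."""
    return "role" in json.loads(body)


# Placeholder value, not a real identifier.
good = create_role_body("role-name-placeholder")
# The wrapped form that this message says the CLI reduces to an empty body.
bad = json.dumps({"role": json.loads(good)})

print(good)
print(is_wrapped(good), is_wrapped(bad))
```

A check like `is_wrapped` could run before invoking the CLI, so the wrapping mistake surfaces locally instead of as the `Field 'role' is required` error.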

Co-authored-by: Isaac
---
 skills/databricks-lakebase/SKILL.md | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/skills/databricks-lakebase/SKILL.md b/skills/databricks-lakebase/SKILL.md
index c96a679..8c58552 100644
--- a/skills/databricks-lakebase/SKILL.md
+++ b/skills/databricks-lakebase/SKILL.md
@@ -243,7 +243,19 @@ GRANT SELECT ON ALL TABLES IN SCHEMA public TO "";
 ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO "";
 ```

-**Grant app SP for AppKit / CRUD apps** (full DML) — run **once after the first deploy** of any app whose `lakebase` plugin owns its own schema. Without this the app fails to connect with `password authentication failed for user ''` because the SP has no Postgres role yet:
+**Grant app SP for AppKit / CRUD apps** (full DML) — run **once after the first deploy** of any app whose `lakebase` plugin owns its own schema. Without this the app fails to connect with `password authentication failed for user ''` because the SP has no Postgres role yet.
+
+Two steps: create the SP's Postgres role, then grant it DML.
+
+Step 1 — create the role. Either via CLI:
+```bash
+databricks postgres create-role projects//branches/ \
+  --role-id \
+  --json '{"spec":{"identity_type":"SERVICE_PRINCIPAL","postgres_role":"","auth_method":"LAKEBASE_OAUTH_V1","membership_roles":["DATABRICKS_SUPERUSER"]}}' \
+  --profile
+```
+
+Or, equivalently, as the first statement in the SQL block below — `databricks_create_role()` does the same thing, lets you bundle role creation and grants into one `psql` round-trip, and is the form the AppKit Lakebase docs use:
 ```sql
 CREATE EXTENSION IF NOT EXISTS databricks_auth;

@@ -260,9 +272,9 @@ BEGIN
   EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO %I', sp);
 END $$;
 ```
-Pipe through `databricks psql` (above) — no UI step required. The grant is idempotent; re-running is safe.
+Pipe through `databricks psql` (above) — no UI step required.
Both forms are idempotent; re-running is safe.

-> **Footgun:** `databricks postgres create-role` (the CLI) cannot create the SP role — every payload shape returns `Field 'role' is required and must contain at least one subfield with a non-default value`. Use the `databricks_create_role()` SQL function above instead.
+> **CLI body shape.** `databricks postgres create-role`'s `--json` flag binds to the inner `Role` object — fields go directly under `spec`, **not** wrapped in `{"role": ...}`. The error `Field 'role' is required and must contain at least one subfield with a non-default value` means the inner Role had no recognized fields (often because someone wrapped the body, which the CLI strips with `Warning: unknown field: role` and ships an empty body). The CLI also doesn't yet expose convenience flags like `--spec.identity-type` ([cmd/workspace/postgres/postgres.go](https://github.com/databricks/cli/blob/main/cmd/workspace/postgres/postgres.go) marks `spec` as TODO), so you must hand-craft the JSON.

 Get SP client ID: `databricks apps get --profile ` → `service_principal_client_id` field.
@@ -277,7 +289,7 @@ Get SP client ID: `databricks apps get --profile ` → `serv
 | `PERMISSION_DENIED` | Check workspace permissions |
 | `permission denied for schema` | Schema owned by another role. Deploy app first so SP creates/owns it |
 | `password authentication failed for user ''` (deployed app) | SP has no Postgres role on the branch yet. Run the **Grant app SP for AppKit / CRUD apps** SQL block above, then restart the app |
-| `Field 'role' is required` from `databricks postgres create-role` | The CLI cannot create SP roles. Use the `databricks_create_role()` SQL function over `databricks psql` instead — see **Grant app SP for AppKit / CRUD apps** |
+| `Field 'role' is required` from `databricks postgres create-role` | `--json` binds to the inner `Role`. Pass fields directly under `spec` (no `{"role": ...}` wrapper).
See the CLI body-shape note in **Grant app SP for AppKit / CRUD apps** |
 | Protected branch won't delete | `update-branch` to set `spec.is_protected` to `false` first |
 | Long-running operation timeout | Use `--no-wait` and poll with `get-operation` |
 | Token expired during long query | Tokens expire after 1 hour; implement refresh (see [connectivity.md](references/connectivity.md)) |

From d3f3fdadfbd7128e5b12d9fd696b868e41e5f210 Mon Sep 17 00:00:00 2001
From: James Broadhead
Date: Tue, 28 Apr 2026 15:51:23 +0000
Subject: [PATCH 3/5] apps+lakebase: declare-resource is the primary path; manual SQL is fallback
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Reframes the SP-grant guidance around the actual root cause: the Apps platform auto-creates the SP's Postgres role on deploy when the app declares a `database` resource (via `--set lakebase.postgres.branch=…` and `--set lakebase.postgres.database=…` at init time, materialized as a `database:` block in the app's `resources:` in `databricks.yml`). The manual `databricks_create_role()` SQL block from the prior commit moves to a fallback for shared/pre-existing Lakebases where the resource form isn't usable.

Adds a generic rule to `databricks-apps/SKILL.md` (Step 3 after init): verify the resources block in `databricks.yml` contains an entry for every required resource from every plugin in the manifest. Same shape of error appears for missing sql_warehouse (403) and missing genie_space (CAN_RUN denied), not just lakebase — so this lives in the apps skill, not the lakebase one.

Cross-refs and troubleshooting rows in `appkit/lakebase.md` updated to point at the resource fix first, manual SQL second.
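
As a rough illustration of the "verify resources after init" rule, the check could look like the Python sketch below. It is not part of the patch. It assumes `databricks.yml` has already been parsed into plain Python objects, and that each entry under the app's `resources` list carries its resource type as a key (for example `database`); that entry shape, the `missing_resources` helper, and all values are hypothetical.

```python
from typing import Iterable


def missing_resources(app_resources: Iterable[dict], required: Iterable[str]) -> list[str]:
    """Return required resource types that have no matching entry.

    Assumes each entry is a mapping that carries its type as a key,
    e.g. {"name": "db", "database": {...}} (hypothetical shape).
    """
    declared = set()
    for entry in app_resources:
        declared.update(entry.keys())
    return sorted(set(required) - declared)


# Hypothetical parsed resources list with no `database` entry.
resources = [
    {"name": "warehouse", "sql_warehouse": {"id": "placeholder"}},
    {"name": "genie", "genie_space": {"space_id": "placeholder"}},
]
# Resource types the message lists for analytics, genie, and lakebase.
required = ["sql_warehouse", "genie_space", "database"]

print(missing_resources(resources, required))
```

With the sample data the sketch reports `database` as the only missing type, which matches the Lakebase failure mode this commit describes.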

Co-authored-by: Isaac
---
 skills/databricks-apps/SKILL.md     |  2 ++
 .../references/appkit/lakebase.md   |  4 +--
 skills/databricks-lakebase/SKILL.md | 26 ++++++++++---------
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/skills/databricks-apps/SKILL.md b/skills/databricks-apps/SKILL.md
index 515fa9c..e595cb6 100644
--- a/skills/databricks-apps/SKILL.md
+++ b/skills/databricks-apps/SKILL.md
@@ -156,6 +156,8 @@ npx @databricks/appkit docs ./docs/plugins/analytics.md # example: specific doc
 **DO NOT guess** plugin names, resource keys, or property names — always derive them from `databricks apps manifest` output. Example: if the manifest shows plugin `analytics` with a required resource `resourceKey: "sql-warehouse"` and `fields: { "id": ... }`, include `--set analytics.sql-warehouse.id=`.

+3. **Verify resources after init.** Open `databricks.yml` and confirm `resources.apps..resources` contains a block for **every** required resource from every plugin you included (manifest's `resources.required`). For example, `--features analytics,genie,lakebase` must produce three blocks: `sql_warehouse`, `genie_space`, **and** `database`. A missing resource means the Apps platform won't grant the SP access to that resource at deploy time, and the app will fail at runtime — typically with `password authentication failed for user ''` (Lakebase), `403` from the SQL warehouse, or `CAN_RUN denied` (Genie). Fix by re-running `init` with the missing `--set` flag, not by hand-editing the YAML — the YAML is a generated artifact and your edit will be lost the next time someone re-scaffolds.
+
 **READ [AppKit Overview](references/appkit/overview.md)** for project structure, workflow, and pre-implementation checklist.

 **Genie Agent Workflow** — when the user wants a Genie-powered app, do **not** start by asking for a Genie Space ID.
Instead:
diff --git a/skills/databricks-apps/references/appkit/lakebase.md b/skills/databricks-apps/references/appkit/lakebase.md
index acf5247..70ba84d 100644
--- a/skills/databricks-apps/references/appkit/lakebase.md
+++ b/skills/databricks-apps/references/appkit/lakebase.md
@@ -246,7 +246,7 @@ Check the response for the `active_deployment` field. If it exists with `status.
 If you skip this step, the Service Principal won't own the database schema. You'll create schemas under your credentials that the SP **cannot access** after deployment. See **`databricks-lakebase`** skill's **Schema Permissions for Deployed Apps** for the full workflow and recovery steps.

-> **First deploy with `lakebase`:** the SP also needs a Postgres role created via `databricks_create_role()` before it can connect at all — otherwise the deployed app fails with `password authentication failed for user ''`. Run the **Grant app SP for AppKit / CRUD apps** SQL block from the **`databricks-lakebase`** skill once after the first deploy, then restart the app.
+> **First deploy with `lakebase`:** confirm `databricks.yml` declares a `database` resource on the app (alongside `sql_warehouse`, `genie_space`, etc.). Apps platform auto-creates the SP's Postgres role only when the database is attached as an app resource — without it, the deployed app fails with `password authentication failed for user ''`. If the resource is missing, re-run `databricks apps init` with `--set lakebase.postgres.branch=...` and `--set lakebase.postgres.database=...`; if you can't (shared Lakebase, custom permissions), use the manual SQL fallback in the **`databricks-lakebase`** skill's **Grant app SP for AppKit / CRUD apps** section.

 The Lakebase env vars (`PGHOST`, `PGDATABASE`, etc.) are auto-set only when deployed. For local development, get the connection details from your endpoint and set them manually:

@@ -278,6 +278,6 @@ Load `server/.env` in your dev server (e.g.
via `dotenv` or `node --env-file=ser
 | `permission denied for schema ` | Schema was created by another role (e.g. you ran locally before deploying) | **Ask the user before dropping** — `DROP SCHEMA` deletes all data. See **`databricks-lakebase`** skill's **Schema Permissions for Deployed Apps** for options |
 | Works locally but `permission denied` after deploy | Local credentials created the schema; the SP can't access schemas it doesn't own | **Ask the user before dropping** — warn about data loss, then deploy first. See **`databricks-lakebase`** skill's **Schema Permissions for Deployed Apps** for options |
 | `connection refused` | Pool not connected or wrong env vars | Check `PGHOST`, `PGPORT`, `LAKEBASE_ENDPOINT` are set |
-| `password authentication failed for user ''` | SP has no Postgres role on the branch (first deploy with `lakebase` plugin) | Run the **Grant app SP for AppKit / CRUD apps** SQL block from the **`databricks-lakebase`** skill, then restart the app |
+| `password authentication failed for user ''` | App's `databricks.yml` is missing a `database` resource — Apps platform never auto-created the SP's Postgres role on attach | Add the missing `database` resource (re-run `databricks apps init` with `--set lakebase.postgres.branch=...` and `--set lakebase.postgres.database=...`), redeploy.
Manual SQL fallback: see **`databricks-lakebase`**'s **Grant app SP for AppKit / CRUD apps** |
 | `relation "X" does not exist` | Tables not initialized | Run `CREATE TABLE IF NOT EXISTS` at startup |
 | App builds but pool fails at runtime | Env vars not set locally | Set vars in `server/.env` — see Local Development above |
diff --git a/skills/databricks-lakebase/SKILL.md b/skills/databricks-lakebase/SKILL.md
index 8c58552..c065285 100644
--- a/skills/databricks-lakebase/SKILL.md
+++ b/skills/databricks-lakebase/SKILL.md
@@ -243,19 +243,13 @@ GRANT SELECT ON ALL TABLES IN SCHEMA public TO "";
 ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO "";
 ```

-**Grant app SP for AppKit / CRUD apps** (full DML) — run **once after the first deploy** of any app whose `lakebase` plugin owns its own schema. Without this the app fails to connect with `password authentication failed for user ''` because the SP has no Postgres role yet.
+**Grant app SP for AppKit / CRUD apps** (full DML).

-Two steps: create the SP's Postgres role, then grant it DML.
+> **First check: is the Lakebase declared as an app resource?** When the Apps platform attaches a `database` resource (declared in the app's `databricks.yml` under `resources.apps..resources`) to an app on deploy, it auto-creates the SP's Postgres role with `CAN_CONNECT_AND_CREATE`. If the SP is failing to connect with `password authentication failed for user ''`, the most likely cause is a missing `database` resource — fix that first, redeploy, and the auto-grant fires. See the `databricks-apps` skill (Scaffolding) for verifying every required plugin resource is declared.
+>
+> The SQL block below is the **fallback** for cases the resource form doesn't cover: granting access to an existing Lakebase the app spec doesn't own (shared across apps, pre-existing schema with custom permissions, post-hoc grants for additional tables/sequences).

-Step 1 — create the role.
Either via CLI:
-```bash
-databricks postgres create-role projects//branches/ \
-  --role-id \
-  --json '{"spec":{"identity_type":"SERVICE_PRINCIPAL","postgres_role":"","auth_method":"LAKEBASE_OAUTH_V1","membership_roles":["DATABRICKS_SUPERUSER"]}}' \
-  --profile
-```
-
-Or, equivalently, as the first statement in the SQL block below — `databricks_create_role()` does the same thing, lets you bundle role creation and grants into one `psql` round-trip, and is the form the AppKit Lakebase docs use:
+Manual fallback — create the role and grant DML, in one psql round-trip:
 ```sql
 CREATE EXTENSION IF NOT EXISTS databricks_auth;

@@ -272,7 +266,15 @@ BEGIN
   EXECUTE format('ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO %I', sp);
 END $$;
 ```
-Pipe through `databricks psql` (above) — no UI step required. Both forms are idempotent; re-running is safe.
+Pipe through `databricks psql` (above). The block is idempotent; re-running is safe.
+
+The role-creation step alone has a CLI form too (useful when granting privileges separately):
+```bash
+databricks postgres create-role projects//branches/ \
+  --role-id \
+  --json '{"spec":{"identity_type":"SERVICE_PRINCIPAL","postgres_role":"","auth_method":"LAKEBASE_OAUTH_V1","membership_roles":["DATABRICKS_SUPERUSER"]}}' \
+  --profile
+```

 > **CLI body shape.** `databricks postgres create-role`'s `--json` flag binds to the inner `Role` object — fields go directly under `spec`, **not** wrapped in `{"role": ...}`. The error `Field 'role' is required and must contain at least one subfield with a non-default value` means the inner Role had no recognized fields (often because someone wrapped the body, which the CLI strips with `Warning: unknown field: role` and ships an empty body).
The CLI also doesn't yet expose convenience flags like `--spec.identity-type` ([cmd/workspace/postgres/postgres.go](https://github.com/databricks/cli/blob/main/cmd/workspace/postgres/postgres.go) marks `spec` as TODO), so you must hand-craft the JSON.

From 115323398653d39406a7c5fd88454edb497ca8c4 Mon Sep 17 00:00:00 2001
From: James Broadhead
Date: Tue, 28 Apr 2026 16:46:54 +0000
Subject: [PATCH 4/5] fix(lakebase): correct --profile placement and security/robustness gaps

- The two `databricks psql` examples placed `--profile ` AFTER `--`, so the flag was forwarded to psql and caused `psql: error: unrecognized option`. Move it before `--` and add a short note explaining the separator semantics.
- The fallback `databricks postgres create-role` example shipped with `membership_roles: ["DATABRICKS_SUPERUSER"]`, contradicting the least-privilege grant block immediately above it. Remove it from the example and add a least-privilege caveat.
- `ALTER DEFAULT PRIVILEGES` without `FOR ROLE` only applies to tables created by the running role, so future synced tables created by the sync pipeline role won't pick up the grant. Add a caveat with both workarounds.
- The connectivity.md `resolve_host` snippet would crash with unhandled FileNotFoundError when `dig` is missing. Wrap the subprocess.run call and raise a RuntimeError with installation guidance.
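
To make the separator semantics from the first bullet concrete, here is a minimal Python sketch, not part of the patch. `build_psql_argv` and the sample values are hypothetical; the flag names (`--profile`, `--project`, `--branch`, `--endpoint`) are the ones used in the examples this series edits.

```python
def build_psql_argv(profile, project, psql_args, branch=None, endpoint=None):
    """Assemble a `databricks psql` command line.

    Every `databricks` flag is placed before `--`; everything after
    `--` is handed to psql unchanged. Hypothetical helper.
    """
    argv = ["databricks", "psql", "--profile", profile, "--project", project]
    if branch:
        argv += ["--branch", branch]
    if endpoint:
        argv += ["--endpoint", endpoint]
    # Separator: nothing after this point is parsed by the databricks CLI.
    return argv + ["--", *psql_args]


# Placeholder profile and project names.
argv = build_psql_argv(
    "my-profile",
    "my-project",
    ["-d", "databricks_postgres", "-c", "SELECT 1"],
)
print(" ".join(argv))
```

Building the list this way keeps `--profile` out of the psql half by construction, which is the mistake the first bullet fixes. The list can be passed to `subprocess.run` directly without shell quoting.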

Co-authored-by: Isaac
---
 skills/databricks-lakebase/SKILL.md   | 14 +++++---
 .../references/connectivity.md        | 36 +++++++++++++++++++
 2 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/skills/databricks-lakebase/SKILL.md b/skills/databricks-lakebase/SKILL.md
index c065285..eba327d 100644
--- a/skills/databricks-lakebase/SKILL.md
+++ b/skills/databricks-lakebase/SKILL.md
@@ -214,13 +214,15 @@ databricks postgres create-endpoint projects//branches/ <
 Preferred — `databricks psql` wrapper handles auth, host discovery, and TLS in one call:
 ```bash
-databricks psql --project --branch --endpoint \
-  -- -d databricks_postgres -f path/to/script.sql --profile
+databricks psql --profile --project --branch --endpoint \
+  -- -d databricks_postgres -f path/to/script.sql

 # One-off statement
-databricks psql --project -- -d databricks_postgres -c "SELECT 1" --profile
+databricks psql --profile --project -- -d databricks_postgres -c "SELECT 1"
 ```

+> **`--profile` placement.** All `databricks` flags (including `--profile`) MUST come before the `--` separator. Anything after `--` is forwarded verbatim to `psql`, which doesn't understand `--profile` and will exit with `psql: error: unrecognized option`.
+
 Requires `psql` on `PATH` (the wrapper shells out to it). Branch/endpoint default to the only one when there is just one.

 Manual form (use when the wrapper isn't available):
@@ -243,6 +245,8 @@ GRANT SELECT ON ALL TABLES IN SCHEMA public TO "";
 ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO "";
 ```

+> **Default privileges caveat.** `ALTER DEFAULT PRIVILEGES` without `FOR ROLE` only applies to tables created by the role running this statement. If sync pipelines create new tables under a different role, re-run `GRANT SELECT ON ALL TABLES IN SCHEMA public TO ""` after each new table appears, or add `FOR ROLE ` once you know which role the sync runs as.
+
 **Grant app SP for AppKit / CRUD apps** (full DML).
 > **First check: is the Lakebase declared as an app resource?** When the Apps platform attaches a `database` resource (declared in the app's `databricks.yml` under `resources.apps..resources`) to an app on deploy, it auto-creates the SP's Postgres role with `CAN_CONNECT_AND_CREATE`. If the SP is failing to connect with `password authentication failed for user ''`, the most likely cause is a missing `database` resource — fix that first, redeploy, and the auto-grant fires. See the `databricks-apps` skill (Scaffolding) for verifying every required plugin resource is declared.
@@ -272,10 +276,12 @@ The role-creation step alone has a CLI form too (useful when granting privileges
 ```bash
 databricks postgres create-role projects//branches/ \
   --role-id \
-  --json '{"spec":{"identity_type":"SERVICE_PRINCIPAL","postgres_role":"","auth_method":"LAKEBASE_OAUTH_V1","membership_roles":["DATABRICKS_SUPERUSER"]}}' \
+  --json '{"spec":{"identity_type":"SERVICE_PRINCIPAL","postgres_role":"","auth_method":"LAKEBASE_OAUTH_V1"}}' \
   --profile
 ```

+> **Least privilege.** The example creates the role with default privileges only — grant database/schema/table access via the explicit `GRANT` statements above. Don't add `membership_roles: ["DATABRICKS_SUPERUSER"]` for an app SP unless broad administrative access is intentional; superuser membership lets the app role read every Lakebase database, not just its own.
+
 > **CLI body shape.** `databricks postgres create-role`'s `--json` flag binds to the inner `Role` object — fields go directly under `spec`, **not** wrapped in `{"role": ...}`. The error `Field 'role' is required and must contain at least one subfield with a non-default value` means the inner Role had no recognized fields (often because someone wrapped the body, which the CLI strips with `Warning: unknown field: role` and ships an empty body).
The CLI also doesn't yet expose convenience flags like `--spec.identity-type` ([cmd/workspace/postgres/postgres.go](https://github.com/databricks/cli/blob/main/cmd/workspace/postgres/postgres.go) marks `spec` as TODO), so you must hand-craft the JSON.

 Get SP client ID: `databricks apps get --profile ` → `service_principal_client_id` field.
diff --git a/skills/databricks-lakebase/references/connectivity.md b/skills/databricks-lakebase/references/connectivity.md
index 6ee62e8..29f22ff 100644
--- a/skills/databricks-lakebase/references/connectivity.md
+++ b/skills/databricks-lakebase/references/connectivity.md
@@ -140,6 +140,42 @@ For production apps, combine with Pattern 2's token refresh loop and SQLAlchemy
 - **Handle scale-to-zero reconnection** — first connection after idle may take ~100ms; implement retry
 - **psycopg2 or psycopg3** — both work; psycopg3 recommended for new development (better async, pooling)

+## DNS Resolution (macOS)
+
+Python's `socket.getaddrinfo()` can fail with long Lakebase hostnames on macOS. Workaround: resolve via `dig`, then pass the IP through `hostaddr` while keeping `host` for TLS SNI.
+
+```bash
+# Resolve the Lakebase hostname to an IP
+dig +short
+```
+
+```python
+import subprocess
+
+def resolve_host(hostname: str) -> str:
+    try:
+        result = subprocess.run(
+            ["dig", "+short", hostname], capture_output=True, text=True, check=False
+        )
+    except FileNotFoundError as e:
+        raise RuntimeError("'dig' is not installed; install it (e.g.
`apt-get install dnsutils`) or use socket.getaddrinfo() instead") from e
+    lines = result.stdout.strip().splitlines()
+    if not lines:
+        raise RuntimeError(f"DNS resolution failed for {hostname}")
+    return lines[0]
+
+ip = resolve_host(endpoint_host)
+
+conn = psycopg.connect(
+    host=endpoint_host, # kept for TLS SNI verification
+    hostaddr=ip, # bypasses getaddrinfo()
+    dbname="databricks_postgres",
+    user=username,
+    password=token,
+    sslmode="require",
+)
+```
+
 ## Data API

 PostgREST-compatible HTTP API for CRUD operations on Postgres tables. **Autoscaling only.**

From eb8ee350009553276ffaf0bb7924eab5ec75756e Mon Sep 17 00:00:00 2001
From: James Broadhead
Date: Tue, 28 Apr 2026 17:34:15 +0000
Subject: [PATCH 5/5] docs(appkit/lakebase): clarify PGUSER must match local credentials
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The local development snippet hardcoded PGUSER=, but AppKit's local dev server authenticates as the developer's Databricks user by default — so the OAuth token (PGPASSWORD) was for the developer while PGUSER named the SP, which Postgres rejects with "password authentication failed for user ''".

Replace the hardcoded value with a note that explains both paths:
- Default (personal profile): PGUSER is your Databricks username/email.
- Testing the deployed flow locally: export DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET so the dev server authenticates as the SP, then PGUSER= matches.
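
The two paths above can be summarized in a small Python sketch, not part of the patch. `expected_pguser` is a hypothetical helper, and it assumes the SP's Postgres role is named after its client ID, which is what the `databricks_create_role()` block elsewhere in this series suggests but which this message does not spell out.

```python
def expected_pguser(env: dict, databricks_user: str) -> str:
    """Pick the PGUSER that matches whoever will mint the OAuth token.

    Assumption: with both SP variables exported the dev server acts as
    the SP, and the SP's Postgres role is named after its client ID.
    """
    client_id = env.get("DATABRICKS_CLIENT_ID")
    client_secret = env.get("DATABRICKS_CLIENT_SECRET")
    if client_id and client_secret:
        return client_id
    # Default path: personal profile, so the role is the developer's user.
    return databricks_user


# Placeholder values, not real credentials.
print(expected_pguser({}, "dev@example.com"))
print(expected_pguser(
    {"DATABRICKS_CLIENT_ID": "sp-id-placeholder",
     "DATABRICKS_CLIENT_SECRET": "secret-placeholder"},
    "dev@example.com",
))
```

In real use the first argument would be `os.environ`. Requiring both variables is a deliberate choice in this sketch: a client ID without a secret falls back to the developer's user.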

Co-authored-by: Isaac
---
 skills/databricks-apps/references/appkit/lakebase.md | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/skills/databricks-apps/references/appkit/lakebase.md b/skills/databricks-apps/references/appkit/lakebase.md
index 70ba84d..28caac0 100644
--- a/skills/databricks-apps/references/appkit/lakebase.md
+++ b/skills/databricks-apps/references/appkit/lakebase.md
@@ -263,11 +263,18 @@ Then create `server/.env` with the values from the endpoint response:
 PGHOST=
 PGPORT=5432
 PGDATABASE=
-PGUSER=
+PGUSER=
 PGSSLMODE=require
 LAKEBASE_ENDPOINT=projects//branches//endpoints/
 ```

+> **`PGUSER` must match the credentials the AppKit dev server uses.** The Postgres role in `PGUSER` has to correspond to the principal that produced `PGPASSWORD` (the OAuth token).
+>
+> - **Default (personal Databricks profile):** AppKit's local server authenticates as your Databricks user, so `PGUSER` is your Databricks username/email. Tables created locally will be owned by your user, not the SP — that's why the deploy-first workflow exists.
+> - **Testing the deployed flow locally:** export `DATABRICKS_CLIENT_ID=` and `DATABRICKS_CLIENT_SECRET=...` so the dev server authenticates as the SP. Then `PGUSER=` matches.
+>
+> If `PGUSER` and the OAuth token disagree, Postgres rejects the connection with `password authentication failed for user ''`.
+
 Load `server/.env` in your dev server (e.g. via `dotenv` or `node --env-file=server/.env`). Never commit `.env` files — add `server/.env` to `.gitignore`.

 ## Troubleshooting