databricks-solutions · jamesbroadhead · May 12, 2026
diff --git a/.github/workflows/sync-skills-from-das.yml b/.github/workflows/sync-skills-from-das.yml
@@ -0,0 +1,49 @@
+name: Sync skills from databricks-agent-skills
+
+on:
+  workflow_dispatch:
+  schedule:
+    # Mondays 06:00 UTC
+    - cron: "0 6 * * 1"
+
+permissions:
+  contents: write
+  pull-requests: write
+
+jobs:
+  sync:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+          token: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Configure git identity
+        run: |
+          git config user.name "github-actions[bot]"
+          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
+
+      - name: Pull experimental skills from upstream subtree
+        run: |
+          git subtree pull \
+            --prefix=databricks-skills/imported \
+            https://github.com/databricks/databricks-agent-skills \
+            experimental-only \
+            --squash \
+            -m "chore(skills): sync from databricks-agent-skills/experimental"
+
+      - name: Open PR if there is drift
+        uses: peter-evans/create-pull-request@v6
+        with:
+          branch: sync/databricks-agent-skills
+          title: "chore(skills): sync from databricks-agent-skills/experimental"
+          body: |
+            Automated sync from
+            [databricks/databricks-agent-skills](https://github.com/databricks/databricks-agent-skills)
+            `experimental-only` branch (`git subtree pull`).
+
+            See [`databricks-skills/SYNC.md`](databricks-skills/SYNC.md) for the
+            mechanism. Edits to imported skills should be made upstream; this
+            PR will overwrite them on next sync.
+          delete-branch: true
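
The last workflow step opens a PR only when the pull actually changed something. A minimal local sketch of that idea, assuming you recorded `HEAD` before running `git subtree pull`. The helper name `drift_status` is hypothetical, and this approximates the check rather than reproducing what `peter-evans/create-pull-request` does internally:

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the imported subtree differs from a base commit.
# PREFIX mirrors the --prefix used by the workflow above.
PREFIX="databricks-skills/imported"

# Usage: drift_status <base-commit>
# Prints "no-drift" if files under PREFIX are identical between <base-commit>
# and HEAD, otherwise "drift".
drift_status() {
  # `git diff --quiet` exits 0 when there are no differences and 1 when there are.
  # Any other failure (for example, an unknown base commit) is also reported as drift.
  if git diff --quiet "$1" HEAD -- "$PREFIX"; then
    echo "no-drift"
  else
    echo "drift"
  fi
}
```

One way to use it: capture `before="$(git rev-parse HEAD)"`, run the `git subtree pull` command from the workflow, then call `drift_status "$before"`. Comparing trees rather than commit SHAs means a sync that adds commits without changing any file still reports `no-drift`.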
diff --git a/databricks-skills/README.md b/databricks-skills/README.md
@@ -2,6 +2,11 @@
 
 Skills that teach Claude Code how to work effectively with Databricks - providing patterns, best practices, and code examples that work with Databricks MCP tools.
 
+> **Note**: the [`imported/`](./imported/) subdirectory is synced from
+> [`databricks/databricks-agent-skills/experimental/`](https://github.com/databricks/databricks-agent-skills/tree/main/experimental).
+> See [SYNC.md](./SYNC.md) for the mechanism. **Do not edit files under
+> `imported/`** — open PRs against the upstream repo instead.
+
 ## Installation
 
 Run from your **project root** (the directory where you want `.claude/skills` created).

diff --git a/databricks-skills/SYNC.md b/databricks-skills/SYNC.md
@@ -0,0 +1,70 @@
+# Sync from databricks-agent-skills
+
+The [`databricks-skills/imported/`](./imported/) directory is kept in sync with
+[`databricks/databricks-agent-skills`](https://github.com/databricks/databricks-agent-skills)
+`experimental/` via `git subtree`.
+
+## Mechanism
+
+- **`databricks-agent-skills`** (d-a-s) publishes a branch named `experimental-only`
+  whose root tree is the contents of `experimental/`. A workflow in that repo runs
+  `git subtree split --prefix=experimental --branch=experimental-only` after
+  each push to `main`.
+- **This repo** runs
+  [`.github/workflows/sync-skills-from-das.yml`](../.github/workflows/sync-skills-from-das.yml)
+  weekly (and on manual dispatch). It calls `git subtree pull` from
+  `experimental-only` into `databricks-skills/imported/` and opens a PR if
+  there is drift.
+
+## Do not edit `imported/` here
+
+Files under `databricks-skills/imported/` are upstream-owned. Local edits will
+be overwritten on the next sync. To change an imported skill, open a PR
+against [`databricks/databricks-agent-skills`](https://github.com/databricks/databricks-agent-skills)
+under `experimental/<skill>/`. The next sync will bring your change back here.
+
+Skills under `databricks-skills/*` that are **not** in `imported/` (the legacy
+top-level skills) remain owned by this repo (`ai-dev-kit`) and can be edited freely.
+Over time these should also migrate upstream; see the PR introducing the sync mechanism.
+
+## Manual sync (if you need it sooner than the cron)
+
+```bash
+git subtree pull \
+  --prefix=databricks-skills/imported \
+  https://github.com/databricks/databricks-agent-skills \
+  experimental-only --squash
+```
+
+Then push the result via PR as usual.
+
+## First-time setup (already done in the PR introducing this file)
+
+```bash
+git subtree add \
+  --prefix=databricks-skills/imported \
+  https://github.com/databricks/databricks-agent-skills \
+  experimental-only --squash
+```
+
+## Why `git subtree` over alternatives
+
+- **vs `git submodule`** — subtree keeps files in the working tree, so
+  `install_skills.sh` and end users see the skills directly without
+  `git submodule update`. Submodules also can't reference a subdirectory of
+  the target repo, so they wouldn't work for this case anyway.
+- **vs `rsync` / `cp` in a workflow** — subtree records each sync as a
+  squashed merge commit referencing the upstream SHA. `git log --grep
+  "Squashed 'databricks-skills/imported/'"` shows the full sync history;
+  you can `git blame` an imported skill back to its upstream commit.
+- **vs hard fork** — subtree pulls are automated and visible in CI; a fork
+  would diverge silently.
+
+## Known limitations
+
+- `git subtree pull --squash` records a squash commit plus a merge commit
+  per sync, and may do so even when the imported files are unchanged. The
+  auto-PR step opens no PR when there is no diff, so no noise reaches reviewers.
+- The `experimental-only` branch on d-a-s is force-pushed by the split
+  workflow. Subtree handles this because each pull is squashed against the
+  prior squash commit; there is no shared linear history to preserve.
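
SYNC.md notes that each squashed sync references the upstream SHA. A minimal sketch of reading that reference back, assuming the squash commits carry the `git-subtree-dir:` and `git-subtree-split:` trailers that `git subtree --squash` writes into their messages. The function names are hypothetical:

```shell
#!/usr/bin/env bash
# Sketch: list the upstream commits recorded by past `git subtree --squash` syncs.
# PREFIX mirrors the --prefix used in SYNC.md; the function names are hypothetical.
PREFIX="databricks-skills/imported"

# Read one commit message on stdin and print the upstream SHA taken from its
# `git-subtree-split:` trailer. Prints nothing if the trailer is absent.
upstream_sha_from_message() {
  sed -n 's/^git-subtree-split: *//p'
}

# Print "<local squash commit> <upstream SHA>" for each recorded sync, newest first.
# Must be run inside the repository that contains PREFIX.
list_syncs() {
  git log --format='%H' --grep="^git-subtree-dir: ${PREFIX}" |
    while read -r commit; do
      printf '%s %s\n' "$commit" \
        "$(git log -1 --format='%B' "$commit" | upstream_sha_from_message)"
    done
}
```

Running `list_syncs` from the repository root gives one line per sync, which can be cross-checked against the `experimental-only` branch upstream. Matching on the trailer rather than the subject line keeps the filter independent of any custom `-m` message passed to `git subtree pull`.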
diff --git a/databricks-skills/imported/README.md b/databricks-skills/imported/README.md
@@ -0,0 +1,89 @@
+> ⚠️ **Experimental — best-effort, not officially supported**
+>
+> The skills in this directory are imported from
+> [databricks-solutions/ai-dev-kit](https://github.com/databricks-solutions/ai-dev-kit)
+> on a best-effort basis. They may be useful, but they are **not officially
+> supported** as part of `databricks-agent-skills`:
+>
+> - They do not follow the same review / quality bar as the skills in
+>   [`../skills/`](../skills/).
+> - They may be out of date relative to upstream `ai-dev-kit`.
+> - They may overlap or conflict with the stable skills (e.g.
+>   `databricks-jobs`, `databricks-model-serving` exist in both directories).
+> - They are not installed by `databricks experimental aitools skills install`
+>   by default — you have to opt in (see the root README).
+>
+> File issues against this directory in this repo; do not file issues against
+> `ai-dev-kit` for skills installed via `databricks-agent-skills`.
+
+---
+
+# Databricks Skills for Claude Code
+
+Skills that teach Claude Code how to work effectively with Databricks - providing patterns, best practices, and code examples that work with Databricks MCP tools.
+
+## Installation
+
+These experimental skills are **not** installed by default. To install them via the Databricks CLI:
+
+```bash
+# Install all experimental skills at once
+databricks experimental aitools skills install --experimental
+
+# Install a single experimental skill by name
+databricks experimental aitools skills install databricks-iceberg
+```
+
+See the root [README](../README.md) for details on the stable install path.
+
+## Available Skills
+
+### 🤖 AI & Agents
+- **databricks-ai-functions** - Built-in AI Functions (ai_classify, ai_extract, ai_summarize, ai_query, ai_forecast, ai_parse_document, and more) with SQL and PySpark patterns, function selection guidance, document processing pipelines, and custom RAG (parse → chunk → index → query)
+- **databricks-agent-bricks** - Knowledge Assistants, Genie Spaces, Supervisor Agents
+- **databricks-genie** - Genie Spaces: create, curate, and query via Conversation API
+- **databricks-mlflow-evaluation** - End-to-end agent evaluation workflow
+- **databricks-unstructured-pdf-generation** - Generate synthetic PDFs for RAG
+- **databricks-vector-search** - Vector similarity search for RAG and semantic search
+
+### 📊 Analytics & Dashboards
+- **databricks-aibi-dashboards** - Databricks AI/BI dashboards (with SQL validation workflow)
+- **databricks-metric-views** - Metric Views for governed metrics
+- **databricks-unity-catalog** - System tables for lineage, audit, billing
+
+### 🔧 Data Engineering
+- **databricks-dbsql** - Databricks SQL warehouse patterns
+- **databricks-iceberg** - Apache Iceberg tables (Managed/Foreign), UniForm, Iceberg REST Catalog, Iceberg Clients Interoperability
+- **databricks-spark-declarative-pipelines** - SDP (formerly DLT) in SQL/Python
+- **databricks-spark-structured-streaming** - Spark Structured Streaming patterns
+- **databricks-jobs** - Multi-task workflows, triggers, schedules *(also available as stable skill)*
+- **databricks-synthetic-data-gen** - Realistic test data with Faker
+- **databricks-zerobus-ingest** - Zerobus ingest patterns
+- **spark-python-data-source** - Python data sources for Spark
+
+### 🚀 Development & Deployment
+- **databricks-bundles** - DABs for multi-environment deployments
+- **databricks-apps-python** - Python web apps (Dash, Streamlit, Flask) with foundation model integration
+- **databricks-python-sdk** - Python SDK, Connect, CLI, REST API
+- **databricks-config** - Profile authentication setup
+- **databricks-execution-compute** - Execute on Databricks compute
+- **databricks-lakebase-autoscale** - Autoscaling for Lakebase
+- **databricks-lakebase-provisioned** - Managed PostgreSQL for OLTP workloads
+
+### 📚 Reference
+- **databricks-docs** - Documentation index via llms.txt
+
+## Provenance & sync model
+
+These skills are imported as a snapshot from
+[`databricks-solutions/ai-dev-kit/databricks-skills/`](https://github.com/databricks-solutions/ai-dev-kit/tree/main/databricks-skills).
+
+**Transition phase (until `ai-dev-kit` skills are locked):**
+- Source of truth is **upstream `ai-dev-kit`**. New work and bug fixes go there.
+- This directory receives **periodic manual re-syncs** — someone opens a PR
+  to bring drift from upstream into `experimental/`.
+
+**Post-lock (after `ai-dev-kit` skill contributions are stopped):**
+- Source of truth is **this repo**. New work and bug fixes go directly to
+  `experimental/<skill>/`.
+- `ai-dev-kit/databricks-skills/` becomes read-only and points here.