49 changes: 49 additions & 0 deletions .github/workflows/sync-skills-from-das.yml
@@ -0,0 +1,49 @@
name: Sync skills from databricks-agent-skills

on:
  workflow_dispatch:
  schedule:
    # Mondays 06:00 UTC
    - cron: "0 6 * * 1"

permissions:
  contents: write
  pull-requests: write

jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Configure git identity
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"

      - name: Pull experimental skills from upstream subtree
        run: |
          git subtree pull \
            --prefix=databricks-skills/imported \
            https://github.com/databricks/databricks-agent-skills \
            experimental-only \
            --squash \
            -m "chore(skills): sync from databricks-agent-skills/experimental"

      - name: Open PR if there is drift
        uses: peter-evans/create-pull-request@v6
        with:
          branch: sync/databricks-agent-skills
          title: "chore(skills): sync from databricks-agent-skills/experimental"
          body: |
            Automated sync from
            [databricks/databricks-agent-skills](https://github.com/databricks/databricks-agent-skills)
            `experimental-only` branch (`git subtree pull`).

            See [`databricks-skills/SYNC.md`](databricks-skills/SYNC.md) for the
            mechanism. Edits to imported skills should be made upstream; this
            PR will overwrite them on next sync.
          delete-branch: true
5 changes: 5 additions & 0 deletions databricks-skills/README.md
@@ -2,6 +2,11 @@

Skills that teach Claude Code how to work effectively with Databricks - providing patterns, best practices, and code examples that work with Databricks MCP tools.

> **Note**: the [`imported/`](./imported/) subdirectory is synced from
> [`databricks/databricks-agent-skills/experimental/`](https://github.com/databricks/databricks-agent-skills/tree/main/experimental).
> See [SYNC.md](./SYNC.md) for the mechanism. **Do not edit files under
> `imported/`** — open PRs against the upstream repo instead.

## Installation

Run from your **project root** (the directory where you want `.claude/skills` created).
70 changes: 70 additions & 0 deletions databricks-skills/SYNC.md
@@ -0,0 +1,70 @@
# Sync from databricks-agent-skills

The [`databricks-skills/imported/`](./imported/) directory is kept in sync with
[`databricks/databricks-agent-skills`](https://github.com/databricks/databricks-agent-skills)
`experimental/` via `git subtree`.

## Mechanism

- **`databricks-agent-skills`** publishes a branch named `experimental-only`
whose root tree is the contents of `experimental/`. A workflow in that repo
runs `git subtree split --prefix=experimental --branch=experimental-only`
after each push to `main`.
- **This repo** runs
[`.github/workflows/sync-skills-from-das.yml`](../.github/workflows/sync-skills-from-das.yml)
weekly (and on manual dispatch). It calls `git subtree pull` from
`experimental-only` into `databricks-skills/imported/` and opens a PR if
there is drift.
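
The upstream half of this mechanism could look roughly like the sketch below. Only the `git subtree split` invocation is taken from the description above; the function name `publish_split_branch`, the `origin` remote, and the use of `git push --force` to publish the branch are assumptions for illustration, not the actual workflow in `databricks-agent-skills`.

```bash
# Hypothetical upstream-side helper (not the real workflow).
# Rebuilds a branch whose root tree is the contents of <prefix>,
# then publishes it. Assumes a full-history checkout and a remote
# named "origin".
publish_split_branch() {
  local prefix="${1:-experimental}"
  local branch="${2:-experimental-only}"
  local remote="${3:-origin}"

  # Synthesize history for <prefix> only and point <branch> at it.
  git subtree split --prefix="$prefix" --branch="$branch" &&
    # The split history is regenerated, so the push is forced.
    git push --force "$remote" "$branch"
}

# Usage, from a clone of the upstream repo:
#   publish_split_branch
```

The defaults mirror the names used in this document; pass other values to try the idea on a scratch repository first.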

## Do not edit `imported/` here

Files under `databricks-skills/imported/` are upstream-owned. Local edits will
be overwritten on the next sync. To change an imported skill, open a PR
against [`databricks/databricks-agent-skills`](https://github.com/databricks/databricks-agent-skills)
under `experimental/<skill>/`. The next sync will bring your change back here.

Skills under `databricks-skills/*` that are **not** in `imported/` (the legacy
top-level skills) remain owned by this repo (`ai-dev-kit`) and can be edited
freely. Over time these should also migrate upstream; see the PR introducing
the sync mechanism.
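
One way to enforce the ownership rule above is a check over the list of paths a change touches. The sketch below is an assumption, not part of this repo: `imported_edits` is a hypothetical helper that reads paths on stdin, prints those under `databricks-skills/imported/`, and succeeds only if it found any. The automated sync PR touches that directory by design, so a real check would need to exempt it.

```bash
# Hypothetical helper: list changed paths that fall under the
# upstream-owned prefix. Exit status is 0 if any were found, 1 otherwise.
imported_edits() {
  local prefix="${1:-databricks-skills/imported/}"

  # index(...) == 1 is a literal "starts with" test, so characters in the
  # prefix are never interpreted as a regular expression.
  awk -v p="$prefix" 'index($0, p) == 1 { print; found = 1 } END { exit !found }'
}

# Usage, on a feature branch (origin/main is an assumed base):
#   if git diff --name-only origin/main...HEAD | imported_edits; then
#     echo "These files are upstream-owned; open a PR upstream instead." >&2
#   fi
```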

## Manual sync (if you need it sooner than the cron)

```bash
git subtree pull \
--prefix=databricks-skills/imported \
https://github.com/databricks/databricks-agent-skills \
experimental-only --squash
```

Then push the result via PR as usual.
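
"As usual" can be read as the sequence below: create a branch, pull the subtree onto it, and push the branch for review. This is a sketch under assumptions: `sync_on_branch` and the default branch name `sync/manual` are hypothetical, and the remote is assumed to be `origin`. `git subtree pull` refuses to run with uncommitted changes, so start from a clean working tree.

```bash
# Hypothetical wrapper around the manual sync shown above.
# Stops at the first failing step so a failed pull is never pushed.
sync_on_branch() {
  local branch="${1:-sync/manual}"

  git checkout -b "$branch" &&
    git subtree pull \
      --prefix=databricks-skills/imported \
      https://github.com/databricks/databricks-agent-skills \
      experimental-only --squash &&
    git push -u origin "$branch"
}

# Usage, from the repository root:
#   sync_on_branch
#   # then open a PR from the pushed branch
```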

## First-time setup (already done in the PR introducing this file)

```bash
git subtree add \
--prefix=databricks-skills/imported \
https://github.com/databricks/databricks-agent-skills \
experimental-only --squash
```

## Why `git subtree` over alternatives

- **vs `git submodule`** — subtree keeps files in the working tree, so
`install_skills.sh` and end users see the skills directly without
`git submodule update`. Submodules also can't reference a subdirectory of
the target repo, so they would not work for this case anyway.
- **vs `rsync` / `cp` in a workflow** — subtree records each sync as a
squashed merge commit referencing the upstream SHA. `git log --grep
"Squashed 'databricks-skills/imported/'"` shows the full sync history;
you can `git blame` an imported skill back to its upstream commit.
- **vs hard fork** — subtree pulls are automated and visible in CI; a fork
would diverge silently.
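
The sync history mentioned above can be turned into a list of upstream commits. `git subtree` writes a `git-subtree-split:` trailer carrying the full upstream SHA into each squash commit message; the helper below (the name `upstream_shas` is made up for this sketch) only extracts that trailer from commit messages read on stdin.

```bash
# Hypothetical helper: print the upstream SHA recorded in each
# git-subtree squash commit message passed on stdin.
upstream_shas() {
  sed -n 's/^git-subtree-split: *//p'
}

# Usage, newest sync first:
#   git log --grep "Squashed 'databricks-skills/imported/'" --format=%B | upstream_shas
```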

## Known limitations

- `git subtree pull --squash` produces a merge commit and a squash commit on
every sync, even when there is no drift. The auto-PR step short-circuits
in that case so no noise reaches reviewers.
- The `experimental-only` branch in `databricks-agent-skills` is force-pushed
by the split workflow. Subtree handles this because each pull is squashed
against the prior squash commit; there is no shared linear history to preserve.
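
Because the upstream branch is rewritten rather than extended, comparing tip SHAs is a simpler test for "is there anything new" than asking git about ancestry. A minimal sketch, assuming the last synced SHA is already known (for example from the most recent squash commit message); `upstream_has_moved` is a hypothetical name, not something this repo ships.

```bash
# Hypothetical check: succeed (exit 0) when the upstream branch tip
# differs from the last SHA that was synced into this repo.
upstream_has_moved() {
  local last_synced="$1"
  local url="${2:-https://github.com/databricks/databricks-agent-skills}"
  local branch="${3:-experimental-only}"
  local tip

  # ls-remote prints "<sha><TAB><ref>"; keep only the SHA.
  tip=$(git ls-remote "$url" "refs/heads/$branch" | cut -f1)

  # An empty tip means the lookup failed or the branch is missing;
  # treat that as "nothing to sync" rather than as drift.
  [ -n "$tip" ] && [ "$tip" != "$last_synced" ]
}

# Usage (the SHA is a placeholder):
#   if upstream_has_moved "<last synced sha>"; then
#     echo "upstream has new content"
#   fi
```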
89 changes: 89 additions & 0 deletions databricks-skills/imported/README.md
@@ -0,0 +1,89 @@
> ⚠️ **Experimental — best-effort, not officially supported**
>
> The skills in this directory are imported from
> [databricks-solutions/ai-dev-kit](https://github.com/databricks-solutions/ai-dev-kit)
> on a best-effort basis. They may be useful, but they are **not officially
> supported** as part of `databricks-agent-skills`:
>
> - They do not follow the same review / quality bar as the skills in
> [`../skills/`](../skills/).
> - They may be out of date relative to upstream `ai-dev-kit`.
> - They may overlap or conflict with the stable skills (e.g.
> `databricks-jobs`, `databricks-model-serving` exist in both directories).
> - They are not installed by `databricks experimental aitools skills install`
> by default — you have to opt in (see the root README).
>
> File issues against this directory in this repo; do not file issues against
> `ai-dev-kit` for skills installed via `databricks-agent-skills`.

---

# Databricks Skills for Claude Code

Skills that teach Claude Code how to work effectively with Databricks - providing patterns, best practices, and code examples that work with Databricks MCP tools.

## Installation

These experimental skills are **not** installed by default. To install them via the Databricks CLI:

```bash
# Install all experimental skills at once
databricks experimental aitools skills install --experimental

# Install a single experimental skill by name
databricks experimental aitools skills install databricks-iceberg
```

See the root [README](../README.md) for details on the stable install path.

## Available Skills

### 🤖 AI & Agents
- **databricks-ai-functions** - Built-in AI Functions (ai_classify, ai_extract, ai_summarize, ai_query, ai_forecast, ai_parse_document, and more) with SQL and PySpark patterns, function selection guidance, document processing pipelines, and custom RAG (parse → chunk → index → query)
- **databricks-agent-bricks** - Knowledge Assistants, Genie Spaces, Supervisor Agents
- **databricks-genie** - Genie Spaces: create, curate, and query via Conversation API
- **databricks-mlflow-evaluation** - End-to-end agent evaluation workflow
- **databricks-unstructured-pdf-generation** - Generate synthetic PDFs for RAG
- **databricks-vector-search** - Vector similarity search for RAG and semantic search

### 📊 Analytics & Dashboards
- **databricks-aibi-dashboards** - Databricks AI/BI dashboards (with SQL validation workflow)
- **databricks-metric-views** - Metric Views for governed metrics
- **databricks-unity-catalog** - System tables for lineage, audit, billing

### 🔧 Data Engineering
- **databricks-dbsql** - Databricks SQL warehouse patterns
- **databricks-iceberg** - Apache Iceberg tables (Managed/Foreign), UniForm, Iceberg REST Catalog, Iceberg Clients Interoperability
- **databricks-spark-declarative-pipelines** - SDP (formerly DLT) in SQL/Python
- **databricks-spark-structured-streaming** - Spark Structured Streaming patterns
- **databricks-jobs** - Multi-task workflows, triggers, schedules *(also available as stable skill)*
- **databricks-synthetic-data-gen** - Realistic test data with Faker
- **databricks-zerobus-ingest** - Zerobus ingest patterns
- **spark-python-data-source** - Python data sources for Spark

### 🚀 Development & Deployment
- **databricks-bundles** - DABs for multi-environment deployments
- **databricks-apps-python** - Python web apps (Dash, Streamlit, Flask) with foundation model integration
- **databricks-python-sdk** - Python SDK, Connect, CLI, REST API
- **databricks-config** - Profile authentication setup
- **databricks-execution-compute** - Execute on Databricks compute
- **databricks-lakebase-autoscale** - Autoscaling for Lakebase
- **databricks-lakebase-provisioned** - Managed PostgreSQL for OLTP workloads

### 📚 Reference
- **databricks-docs** - Documentation index via llms.txt

## Provenance & sync model

These skills are imported as a snapshot from
[`databricks-solutions/ai-dev-kit/databricks-skills/`](https://github.com/databricks-solutions/ai-dev-kit/tree/main/databricks-skills).

**Transition phase (until `ai-dev-kit` skills are locked):**
- Source of truth is **upstream `ai-dev-kit`**. New work and bug fixes go there.
- This directory receives **periodic manual re-syncs** — someone opens a PR
to bring drift from upstream into `experimental/`.

**Post-lock (after `ai-dev-kit` skill contributions are stopped):**
- Source of truth is **this repo**. New work and bug fixes go directly to
`experimental/<skill>/`.
- `ai-dev-kit/databricks-skills/` becomes read-only and points here.