Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
82 changes: 82 additions & 0 deletions .github/ISSUE_TEMPLATE/migration-feedback.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
---
name: Migration Skill Feedback
about: Report a bug, suggest a pattern, or submit a failure report for the serverless migration skill
title: "[migration-skill] "
labels: migration-skill
assignees: ''
---

<!--
Before submitting, please review the PII guidelines in README.md:
- Remove customer code — use minimal reproducible examples
- Remove credentials, workspace URLs, catalog names, user emails
- Remove customer-identifying information
-->

## Category

<!-- Check one -->

- [ ] Pattern not detected (the skill missed a classic-compute construct)
- [ ] Wrong fix suggested (the fix didn't work or was incorrect)
- [ ] Migration succeeded but output differs (code runs, data is different)
- [ ] Documentation unclear
- [ ] Failure report (attach JSON from `~/.databricks-migration-skill/reports/`)
- [ ] New pattern suggestion

## Pre-submission checklist

- [ ] I have removed all customer code and replaced with a minimal reproducible example
- [ ] I have removed all credentials, tokens, and secret scope names
- [ ] I have removed all workspace URLs, catalog names, and user emails
- [ ] I have removed any customer-identifying information

## Environment

- Skill version: <!-- e.g., v0.2.0 (check SKILL.md metadata.version) -->
- Agent: <!-- e.g., Claude Code, Cursor, custom -->
- Model: <!-- e.g., claude-sonnet-4-6 -->
- Databricks Runtime of source workload: <!-- e.g., 14.3 LTS -->

## Description

<!--
What happened? What did you expect to happen?
Include a minimal reproducible example if relevant.
-->

## Minimal reproducible example

<!-- Replace with a 5-10 line snippet that shows the pattern -->
```python
# Classic compute code (minimal example, no customer data)
```

## Expected migration

<!-- What the skill should have done -->

## Actual migration

<!-- What the skill actually did, if anything -->

## Failure report (optional)

<!--
If you're attaching a failure report JSON:
1. Open it and confirm no PII is present
2. Paste the contents below in a code block, OR upload as a file attachment
-->

<details>
<summary>Failure report JSON</summary>

```json
<!-- paste report contents here after reviewing for PII -->
```

</details>

## Additional context

<!-- Any other context, screenshots, or information -->
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ Run this command in chat:
## Available Skills

- **databricks-apps** - Build full-stack TypeScript apps on Databricks using AppKit
- **databricks-serverless-migration** - Migrate classic-compute workloads (notebooks, jobs, pipelines) to serverless compute. Detects compatibility blockers, provides concrete fixes, and produces an anonymized failure report you can file as a GitHub issue when migration cannot complete. Also installable inside a Databricks workspace via Genie Code Agent mode — see [the install reference](skills/databricks-serverless-migration/references/install-in-databricks-genie-code.md).

## Structure

Expand Down
6 changes: 4 additions & 2 deletions manifest.json
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{
"version": "2",
"updated_at": "2026-04-30T11:02:41Z",
"updated_at": "2026-05-12T07:54:37Z",
"skills": {
"databricks-apps": {
"version": "0.1.1",
Expand Down Expand Up @@ -152,7 +152,7 @@
"version": "0.1.0",
"description": "Migrate Databricks workloads from classic compute to serverless compute, including compatibility checks and concrete fixes",
"experimental": false,
"updated_at": "2026-04-24T15:10:23Z",
"updated_at": "2026-05-12T07:54:37Z",
"files": [
"SKILL.md",
"agents/openai.yaml",
Expand All @@ -161,6 +161,8 @@
"references/code-patterns.md",
"references/compatibility-checks.md",
"references/configuration-guide.md",
"references/failure-reporting.md",
"references/install-in-databricks-genie-code.md",
"references/networking-and-security.md",
"references/streaming-migration.md"
]
Expand Down
125 changes: 97 additions & 28 deletions skills/databricks-serverless-migration/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,15 @@ Analyze existing Databricks code for serverless compute compatibility and guide
- Troubleshooting serverless-specific errors after migration
- Choosing between Performance-Optimized and Standard mode

## Where to Run This Skill

This skill is published as an Agent Skill (agentskills.io) and runs in any compatible client:

- **Claude Code, Cursor, or any agentskills.io client on your laptop** — the default. Install via `databricks experimental aitools install` or follow the per-client docs.
- **Inside a Databricks workspace via Genie Code Agent mode** — drop the skill into `/Workspace/Users/<you>/.assistant/skills/databricks-serverless-migration/` (per-user) or `/Workspace/.assistant/skills/databricks-serverless-migration/` (workspace-wide, admin only). See [Install in Databricks Genie Code](references/install-in-databricks-genie-code.md) for the three install methods and the **important serverless-compute caveat** (the Databricks CLI isn't pre-installed on serverless, so some deploy steps need adjustment).

If you finish a migration analysis for a user who's currently running you from a laptop client, mention the Genie Code option once at the end — many users prefer iterating on migrations inside the workspace where the workload lives.

## Understanding Migration Blockers

Migration blockers fall into three categories. Focus your effort on category 2 — that's where this skill helps most.
Expand Down Expand Up @@ -545,29 +554,36 @@ See [Configuration Guide](references/configuration-guide.md) for the full error

## Failure Reporting Protocol

When migration fails irrecoverably, generate a structured failure report to help improve the skill. This applies when:
When migration fails or hits an unmigratable pattern, generate a structured failure report and offer the user a one-step path to file it as a GitHub issue. This feedback loop is how the skill learns — without it, gaps go undiscovered.

See [Failure Reporting](references/failure-reporting.md) for the full redaction checklist, URL-encoding recipe, and example pre-filled link.

### Decision tree: when to offer to file an issue

- All retry attempts are exhausted (typically 5)
- An unknown pattern is encountered that isn't in the compatibility checks
- A fix was applied but didn't resolve the underlying issue
- The workload hits a Category 3 blocker the user wasn't aware of
**Offer to file a GitHub issue any time the workload cannot be fully migrated to serverless as-is.** Reports from "known" patterns (R, Scala, custom JAR data sources, JVM access, third-party connectors) are just as valuable as reports from unknown patterns — they tell maintainers which gaps users hit most often, which drives prioritization.

### When to generate a report
Concretely, ALWAYS offer if **any** of these is true:

Generate a report at the end of a migration attempt if **any** of:
- `retry_count >= max_retries` and final status is FAILED
- A pattern was detected but no fix is available in the skill
- The user explicitly requests a failure report (`/migration-report`)
1. **Workload contains any Category 3 (classic-only) blocker** — R, Scala notebook cells, custom JAR data sources, JVM/Py4J access, third-party connectors without serverless equivalents, native binary dependencies, etc. The fact that the pattern is "documented as Cat 3" is not a reason to skip the offer.
2. **Retries exhausted** — `retry_count >= max_retries` (typically 5) and final status is FAILED
3. **Unknown pattern** — a classic-compute construct was detected that isn't in the skill's catalog
4. **Fix didn't resolve** — a known fix was applied but the workload still fails on serverless
5. **Explicit request** — the user invokes `/migration-report`

### How to generate
**Do NOT offer to file** only when:

- The migration succeeded fully (even after retries), or
- The workload is already serverless-compatible and required no changes.

### How to generate the report

Write a JSON file to `~/.databricks-migration-skill/reports/failure-<ISO-timestamp>.json`. Create the directory if it doesn't exist.

**Schema** (strictly follow — no free-text code or identifiers):

```json
{
"report_version": "1.0",
"report_version": "1.1",
"report_id": "<uuid-v4>",
"skill_version": "<from SKILL.md frontmatter metadata.version>",
"timestamp": "<ISO 8601 UTC>",
Expand All @@ -578,7 +594,7 @@ Write a JSON file to `~/.databricks-migration-skill/reports/failure-<ISO-timesta
"attempted_fixes": [
{"pattern_id": "rdd_parallelize", "fix_applied": "<fix_id>", "attempt_number": 1, "outcome": "failed"}
],
"final_error_category": "unknown_api | missing_library | data_access | permission | custom_data_source | other",
"final_error_category": "unknown_api | missing_library | data_access | permission | custom_data_source | jvm_access | unsupported_language | other",
"final_error_signature": "<SHA256 of top 3 stack frames, NOT the frames themselves>",
"retry_count": 5,
"total_duration_seconds": 245,
Expand All @@ -594,34 +610,85 @@ Write a JSON file to `~/.databricks-migration-skill/reports/failure-<ISO-timesta

### What the report MUST NOT contain

This is a hard requirement — the report must be safe to share publicly on GitHub Issues:
Hard requirement — the report must be safe to share publicly on GitHub Issues:

- **No code content** — only pattern IDs from this skill's catalog (e.g., `rdd_parallelize`), never actual code snippets
- **No file paths** — no notebook names, directory paths, or workspace URLs
- **No code content** — pattern IDs only (e.g., `rdd_parallelize`), never code snippets, function bodies, or even single-line examples
- **No file paths** — no notebook names, directory paths, workspace URLs, or DBFS paths
- **No error message text** — only the error category enum and a hashed signature
- **No identifiers** — no table names, column names, catalog names, schema names, user emails, workspace IDs, or customer names
- **No credentials** — no secret scope names, API keys, or connection strings
- **No data descriptions** — no column value samples, row counts tied to specific tables, or data shape details beyond the `notebook_characteristics` fields
- **No identifiers** — no table names, column names, catalog names, schema names, secret scope names, user emails, workspace IDs, or account IDs
- **No internal Databricks references** — no Databricks employee names, internal codenames (e.g., product code names not in public docs), `go/` links, Confluence page IDs, Google Doc IDs, Slack user or channel IDs (`U…`, `C…`), PROD-* / SEV-* / SC-* ticket numbers
- **No customer references** — no company names, product names of customer systems, or anything that would identify the workspace's owning organization
- **No credentials** — no tokens, API keys, connection strings, JDBC URLs, or service principal IDs
- **No data descriptions** — no column value samples, row counts tied to specific tables, or schema fingerprints beyond the `notebook_characteristics` fields

### Anonymization safety pass

Before writing the report, scan every string field against this pattern checklist. If any pattern matches, **drop the offending field** (do not redact partially — empty string is safer than risking leakage):

| Pattern | What to scrub |
|---------|---------------|
| `[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}` | Email addresses |
| `dbfs:/`, `/dbfs/`, `s3://`, `abfss://`, `gs://`, `wasbs://` | Cloud storage paths |
| `https?://[a-z0-9-]+\.cloud\.databricks\.com`, `https?://adb-\d+\.\d+\.azuredatabricks\.net` | Workspace URLs |
| `U[A-Z0-9]{8,}`, `C[A-Z0-9]{8,}` | Slack user / channel IDs |
| `\bgo/[a-z0-9-]+\b` | go/ links |
| `\b(PROD|SEV|SC|JIRA)-\d+\b` | Internal ticket IDs |
| `[0-9a-f]{20,}` (heuristic) | Likely doc/file/workspace IDs |
| Catalog/schema/table name literals from the analyzed notebook | Drop and replace with `"<redacted>"` |

### After generating the report
The `notebook_characteristics` fields are the only safe surface for workload metadata. Do not add new fields without expanding this checklist.

Tell the user:
### After generating the report — REQUIRED output template

**This is not optional.** When the decision tree above says "offer to file", you MUST produce all three of these in your final response to the user, in order:

1. The local report file path (literally include the string `~/.databricks-migration-skill/reports/failure-<timestamp>.json`, with `<timestamp>` filled in).
2. **Option A** — a complete pre-filled `https://github.com/databricks/databricks-agent-skills/issues/new?template=migration-feedback.md&title=<…>&body=<…>` URL, both parameters URL-encoded.
3. **Option B** — the literal `gh issue create --repo databricks/databricks-agent-skills …` command, body-file pointing at the local report.

If you write the JSON inline in your response but skip the URL and the `gh` command, the protocol is **not satisfied**. The whole point is to make filing one click.

Use this exact wrap-up template, replacing `<…>` placeholders:

```
Migration failed after <N> attempts. A failure report has been generated at:
Migration could not complete. A failure report has been generated at:

~/.databricks-migration-skill/reports/failure-<timestamp>.json

This report contains anonymized diagnostic data (detected patterns, error categories, retry count) and no code content or PII. You can:
The report contains anonymized diagnostic data (detected pattern IDs, error
category, retry count, notebook characteristics) and no code content or PII.
Submission is optional and opt-in.

To help improve this skill, file the report as a GitHub issue:

1. Review the JSON to confirm no sensitive information is present
2. Share it via GitHub Issue to help improve the skill:
https://github.com/databricks/databricks-agent-skills/issues/new?template=migration-feedback.md
Option A — One-click in browser (pre-filled):
<PREFILLED_ISSUE_URL>

Submission is optional and opt-in. We use reports to prioritize new patterns and fix detection gaps.
Option B — From the terminal (if you have the GitHub CLI installed):
gh issue create \
--repo databricks/databricks-agent-skills \
--title "<TITLE>" \
--body-file ~/.databricks-migration-skill/reports/failure-<timestamp>.json \
--label migration-skill

Before submitting, please open the JSON and confirm nothing sensitive
slipped through. We never transmit reports automatically.
```

**Never transmit the report automatically.** The user owns their data and must review before sharing.
Build `<PREFILLED_ISSUE_URL>` like this:

1. **Title**: `[migration-skill] <final_error_category> in <failure_phase> phase`
Example: `[migration-skill] custom_data_source in migrate phase`
2. **Body**: the issue template's markdown skeleton (Category, Environment, Description, Failure report JSON fenced in ` ```json `) with the report JSON inlined.
3. **URL-encode** both title and body (`%20` for spaces, `%23` for `#`, `%0A` for newline, etc.).
4. **Final URL**:
`https://github.com/databricks/databricks-agent-skills/issues/new?template=migration-feedback.md&title=<URL-encoded title>&body=<URL-encoded body>`

If your runtime cannot actually write the file (sandboxed, no filesystem write), still show the path the file WOULD be at and produce Options A and B. The user can write the JSON to disk themselves.

The full recipe with a worked example is in [Failure Reporting](references/failure-reporting.md).

**Never transmit the report automatically.** The user owns their data and must review before sharing. If the user declines, do not press them — log the local report path and move on.

## Reference Guides

Expand All @@ -632,6 +699,8 @@ For detailed workarounds and code examples beyond the quick fixes above:
- [Networking and Security](references/networking-and-security.md) — VPC peering to NCCs, Private Link, firewall setup
- [Code Patterns](references/code-patterns.md) — Complete before/after code examples for every migration pattern
- [Configuration Guide](references/configuration-guide.md) — Supported Spark configs, Environments setup, budget policies
- [Failure Reporting](references/failure-reporting.md) — Redaction checklist + pre-filled GitHub issue URL recipe (for when migration cannot complete)
- [Install in Databricks Genie Code](references/install-in-databricks-genie-code.md) — Run this skill inside a Databricks workspace

## Documentation

Expand Down
Loading
Loading