diff --git a/.claude/skills/use-supervisor-api/SKILL.md b/.claude/skills/use-supervisor-api/SKILL.md new file mode 100644 index 00000000..e5882021 --- /dev/null +++ b/.claude/skills/use-supervisor-api/SKILL.md @@ -0,0 +1,183 @@ +--- +name: use-supervisor-api +description: "Replace the client-side agent loop with Databricks Supervisor API (hosted tools). Use when: (1) User asks about Supervisor API, (2) User wants Databricks to run the agent loop server-side, (3) Connecting Genie spaces, UC functions, agent endpoints, or MCP servers as hosted tools." +--- + +# Use the Databricks Supervisor API + +The Supervisor API lets Databricks run the tool-selection and synthesis loop server-side. Instead of your agent managing tool calls and looping, you declare hosted tools and call `responses.create()` — Databricks handles the rest. + +## When to Use + +Use the Supervisor API when you want Databricks to manage the full agent loop for hosted tools: Genie spaces, UC functions, KA (Knowledge Assistant) agent endpoints, or MCP servers via UC connections. + +**Limitations:** +- Cannot mix hosted tools with client-side function tools in the same request +- Inference parameters (e.g., `temperature`, `top_p`) are not supported when tools are passed + +## Step 1: Install `databricks-openai` + +Add to `pyproject.toml` if not already present: + +```toml +[project] +dependencies = [ + ... + "databricks-openai>=0.14.0", + "databricks-sdk>=0.55.0", +] +``` + +Then run `uv sync`. + +## Step 2: Declare Hosted Tools + +Define your tools as a list of dicts. Run `uv run discover-tools` to find available resources in your workspace. 
+
+```python
+TOOLS = [
+    # Genie space — natural language queries over structured data
+    {
+        "type": "genie_space",
+        "genie_space": {
+            "description": "Query sales data using natural language",
+            "space_id": "<genie-space-id>",
+        },
+    },
+    # UC function — SQL or Python UDF
+    {
+        "type": "unity_catalog_function",
+        "unity_catalog_function": {
+            "name": "<catalog>.<schema>.<function>",
+            "description": "Executes a custom UC function",
+        },
+    },
+    # KA (Knowledge Assistant) endpoint — delegates to a Knowledge Assistant agent
+    # Note: agent_endpoint only supports KA endpoints, not arbitrary agent serving endpoints.
+    # KA endpoints use a specific ka_query protocol; regular LangGraph/OpenAI agents do not.
+    {
+        "type": "agent_endpoint",
+        "agent_endpoint": {
+            "name": "my-ka-agent",
+            "description": "A Knowledge Assistant agent",
+            "endpoint_name": "<ka-endpoint-name>",
+        },
+    },
+    # External MCP server via UC connection
+    {
+        "type": "external_mcp_server",
+        "external_mcp_server": {
+            "description": "An external MCP server",
+            "connection_name": "<uc-connection-name>",
+        },
+    },
+]
+```
+
+## Step 3: Update `agent_server/agent.py`
+
+Replace your existing invoke/stream handlers with the Supervisor API pattern. Remove any MCP client setup, LangGraph agents, or OpenAI Agents SDK runner code — the Supervisor API replaces the client-side loop entirely.
+
+`use_ai_gateway=True` automatically resolves the correct AI Gateway endpoint for the workspace.
+
+When deployed on Databricks Apps, the platform forwards the authenticated user's token via `x-forwarded-access-token`. Pass this to the Supervisor API so tool calls (e.g., Genie queries) run on behalf of the user rather than the app's service principal.
+ +```python +import mlflow +from databricks.sdk import WorkspaceClient +from databricks.sdk.config import Config +from databricks_openai import DatabricksOpenAI +from mlflow.genai.agent_server import invoke, stream +from mlflow.types.responses import ( + ResponsesAgentRequest, + ResponsesAgentResponse, +) + +mlflow.openai.autolog() + +MODEL = "databricks-claude-sonnet-4-5" +TOOLS = [...] # From Step 2 + +# Resolve and cache the AI Gateway URL once at module load +_wc = WorkspaceClient() +_client = DatabricksOpenAI(workspace_client=_wc, use_ai_gateway=True) +_ai_gateway_base_url = str(_client.base_url) + + +def _get_client(obo_token: str | None = None) -> DatabricksOpenAI: + """Return a client using the OBO token if provided, else service principal.""" + if obo_token: + obo_wc = WorkspaceClient( + config=Config(host=_wc.config.host, token=obo_token) + ) + return DatabricksOpenAI(workspace_client=obo_wc, base_url=_ai_gateway_base_url) + return _client + + +def _obo_token(request: ResponsesAgentRequest) -> str | None: + return (request.custom_inputs or {}).get("x-forwarded-access-token") + + +@invoke() +def invoke_handler(request: ResponsesAgentRequest) -> ResponsesAgentResponse: + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + response = _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=False, + ) + return ResponsesAgentResponse(output=[item.model_dump() for item in response.output]) + + +@stream() +def stream_handler(request: ResponsesAgentRequest): + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + return _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=True, + ) +``` + +> **OBO note:** The `x-forwarded-access-token` is injected into `custom_inputs` by the app server middleware. 
No changes are needed to the client — the token arrives automatically when users call your deployed app. + +## Step 4: Grant Permissions in `databricks.yml` + +For each hosted tool, grant the corresponding resource access. See the **add-tools** skill for complete YAML examples. + +| Tool type | Resource to grant | +|-----------|-------------------| +| `genie_space` | `genie_space` with `CAN_RUN` | +| `unity_catalog_function` | `uc_securable` (FUNCTION) with `EXECUTE` | +| `agent_endpoint` | `serving_endpoint` with `CAN_QUERY` (KA endpoints only) | +| `external_mcp_server` | `uc_securable` (CONNECTION) with `USE_CONNECTION` | + +Also grant `CAN_QUERY` on the `MODEL` serving endpoint: + +```yaml +- name: 'model-endpoint' + serving_endpoint: + name: 'databricks-claude-sonnet-4-5' + permission: 'CAN_QUERY' +``` + +## Step 5: Test and Deploy + +```bash +uv run start-app # Test locally +databricks bundle deploy && databricks bundle run {{BUNDLE_NAME}} # Deploy +``` + +## Troubleshooting + +**"Please ensure AI Gateway V2 is enabled"** — AI Gateway must be enabled for the workspace. Contact your Databricks account team. + +**"Cannot mix hosted and client-side tools"** — Remove any `function`-type tools (Python callables) from `TOOLS`. All tools must be hosted types (`genie_space`, `unity_catalog_function`, `agent_endpoint`, `external_mcp_server`). + +**"Parameter not supported when tools are provided"** — Remove `temperature`, `top_p`, or other inference parameters from the `responses.create()` call. 
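For reference, grants for the other hosted tool types follow the same resource-list shape as the `MODEL` endpoint grant above. The field names below are a sketch only (`space_id`, `securable_type`, and `securable_full_name` are assumptions about the resource schema); the add-tools skill has the authoritative examples:

```yaml
# Sketch only — verify exact field names against the add-tools skill.
- name: 'genie-space'
  genie_space:
    space_id: '<genie-space-id>'
    permission: 'CAN_RUN'
- name: 'uc-function'
  uc_securable:
    securable_type: 'FUNCTION'
    securable_full_name: '<catalog>.<schema>.<function>'
    permission: 'EXECUTE'
```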
diff --git a/.gitignore b/.gitignore index 8ee7fb7f..6e6d78be 100644 --- a/.gitignore +++ b/.gitignore @@ -189,6 +189,7 @@ mlflow.db !.claude/skills/agent-langgraph-memory/ !.claude/skills/agent-openai-memory/ !.claude/skills/migrate-from-model-serving/ +!.claude/skills/use-supervisor-api/ !.claude/skills/enable-feedback/ !.claude/AGENTS.md !.claude/CLAUDE.md \ No newline at end of file diff --git a/.scripts/sync-skills.py b/.scripts/sync-skills.py index d5fc82fd..d0f55cb4 100755 --- a/.scripts/sync-skills.py +++ b/.scripts/sync-skills.py @@ -55,6 +55,9 @@ def sync_template(template: str, config: dict): # Deploy skill (with substitution) copy_skill(SOURCE / "deploy", dest / "deploy", subs) + # Supervisor API skill (with substitution for bundle name in deploy command) + copy_skill(SOURCE / "use-supervisor-api", dest / "use-supervisor-api", subs) + # SDK-specific skills (with substitution for bundle name references) if isinstance(sdk, list): # Multiple SDKs: copy skills for each, keeping SDK suffix in name diff --git a/agent-langgraph-long-term-memory/.claude/skills/use-supervisor-api/SKILL.md b/agent-langgraph-long-term-memory/.claude/skills/use-supervisor-api/SKILL.md new file mode 100644 index 00000000..db9d8776 --- /dev/null +++ b/agent-langgraph-long-term-memory/.claude/skills/use-supervisor-api/SKILL.md @@ -0,0 +1,183 @@ +--- +name: use-supervisor-api +description: "Replace the client-side agent loop with Databricks Supervisor API (hosted tools). Use when: (1) User asks about Supervisor API, (2) User wants Databricks to run the agent loop server-side, (3) Connecting Genie spaces, UC functions, agent endpoints, or MCP servers as hosted tools." +--- + +# Use the Databricks Supervisor API + +The Supervisor API lets Databricks run the tool-selection and synthesis loop server-side. Instead of your agent managing tool calls and looping, you declare hosted tools and call `responses.create()` — Databricks handles the rest. 
+
+## When to Use
+
+Use the Supervisor API when you want Databricks to manage the full agent loop for hosted tools: Genie spaces, UC functions, KA (Knowledge Assistant) agent endpoints, or MCP servers via UC connections.
+
+**Limitations:**
+- Cannot mix hosted tools with client-side function tools in the same request
+- Inference parameters (e.g., `temperature`, `top_p`) are not supported when tools are passed
+
+## Step 1: Install `databricks-openai`
+
+Add to `pyproject.toml` if not already present:
+
+```toml
+[project]
+dependencies = [
+    ...
+    "databricks-openai>=0.14.0",
+    "databricks-sdk>=0.55.0",
+]
+```
+
+Then run `uv sync`.
+
+## Step 2: Declare Hosted Tools
+
+Define your tools as a list of dicts. Run `uv run discover-tools` to find available resources in your workspace.
+
+```python
+TOOLS = [
+    # Genie space — natural language queries over structured data
+    {
+        "type": "genie_space",
+        "genie_space": {
+            "description": "Query sales data using natural language",
+            "space_id": "<genie-space-id>",
+        },
+    },
+    # UC function — SQL or Python UDF
+    {
+        "type": "unity_catalog_function",
+        "unity_catalog_function": {
+            "name": "<catalog>.<schema>.<function>",
+            "description": "Executes a custom UC function",
+        },
+    },
+    # KA (Knowledge Assistant) endpoint — delegates to a Knowledge Assistant agent
+    # Note: agent_endpoint only supports KA endpoints, not arbitrary agent serving endpoints.
+    # KA endpoints use a specific ka_query protocol; regular LangGraph/OpenAI agents do not.
+    {
+        "type": "agent_endpoint",
+        "agent_endpoint": {
+            "name": "my-ka-agent",
+            "description": "A Knowledge Assistant agent",
+            "endpoint_name": "<ka-endpoint-name>",
+        },
+    },
+    # External MCP server via UC connection
+    {
+        "type": "external_mcp_server",
+        "external_mcp_server": {
+            "description": "An external MCP server",
+            "connection_name": "<uc-connection-name>",
+        },
+    },
+]
+```
+
+## Step 3: Update `agent_server/agent.py`
+
+Replace your existing invoke/stream handlers with the Supervisor API pattern. 
Remove any MCP client setup, LangGraph agents, or OpenAI Agents SDK runner code — the Supervisor API replaces the client-side loop entirely. + +`use_ai_gateway=True` automatically resolves the correct AI Gateway endpoint for the workspace. + +When deployed on Databricks Apps, the platform forwards the authenticated user's token via `x-forwarded-access-token`. Pass this to the Supervisor API so tool calls (e.g., Genie queries) run on behalf of the user rather than the app's service principal. + +```python +import mlflow +from databricks.sdk import WorkspaceClient +from databricks.sdk.config import Config +from databricks_openai import DatabricksOpenAI +from mlflow.genai.agent_server import invoke, stream +from mlflow.types.responses import ( + ResponsesAgentRequest, + ResponsesAgentResponse, +) + +mlflow.openai.autolog() + +MODEL = "databricks-claude-sonnet-4-5" +TOOLS = [...] # From Step 2 + +# Resolve and cache the AI Gateway URL once at module load +_wc = WorkspaceClient() +_client = DatabricksOpenAI(workspace_client=_wc, use_ai_gateway=True) +_ai_gateway_base_url = str(_client.base_url) + + +def _get_client(obo_token: str | None = None) -> DatabricksOpenAI: + """Return a client using the OBO token if provided, else service principal.""" + if obo_token: + obo_wc = WorkspaceClient( + config=Config(host=_wc.config.host, token=obo_token) + ) + return DatabricksOpenAI(workspace_client=obo_wc, base_url=_ai_gateway_base_url) + return _client + + +def _obo_token(request: ResponsesAgentRequest) -> str | None: + return (request.custom_inputs or {}).get("x-forwarded-access-token") + + +@invoke() +def invoke_handler(request: ResponsesAgentRequest) -> ResponsesAgentResponse: + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + response = _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=False, + ) + return 
ResponsesAgentResponse(output=[item.model_dump() for item in response.output]) + + +@stream() +def stream_handler(request: ResponsesAgentRequest): + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + return _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=True, + ) +``` + +> **OBO note:** The `x-forwarded-access-token` is injected into `custom_inputs` by the app server middleware. No changes are needed to the client — the token arrives automatically when users call your deployed app. + +## Step 4: Grant Permissions in `databricks.yml` + +For each hosted tool, grant the corresponding resource access. See the **add-tools** skill for complete YAML examples. + +| Tool type | Resource to grant | +|-----------|-------------------| +| `genie_space` | `genie_space` with `CAN_RUN` | +| `unity_catalog_function` | `uc_securable` (FUNCTION) with `EXECUTE` | +| `agent_endpoint` | `serving_endpoint` with `CAN_QUERY` (KA endpoints only) | +| `external_mcp_server` | `uc_securable` (CONNECTION) with `USE_CONNECTION` | + +Also grant `CAN_QUERY` on the `MODEL` serving endpoint: + +```yaml +- name: 'model-endpoint' + serving_endpoint: + name: 'databricks-claude-sonnet-4-5' + permission: 'CAN_QUERY' +``` + +## Step 5: Test and Deploy + +```bash +uv run start-app # Test locally +databricks bundle deploy && databricks bundle run agent_langgraph_long_term_memory # Deploy +``` + +## Troubleshooting + +**"Please ensure AI Gateway V2 is enabled"** — AI Gateway must be enabled for the workspace. Contact your Databricks account team. + +**"Cannot mix hosted and client-side tools"** — Remove any `function`-type tools (Python callables) from `TOOLS`. All tools must be hosted types (`genie_space`, `unity_catalog_function`, `agent_endpoint`, `external_mcp_server`). 
+ +**"Parameter not supported when tools are provided"** — Remove `temperature`, `top_p`, or other inference parameters from the `responses.create()` call. diff --git a/agent-langgraph-long-term-memory/.gitignore b/agent-langgraph-long-term-memory/.gitignore index ec2a577a..3a1fbcce 100644 --- a/agent-langgraph-long-term-memory/.gitignore +++ b/agent-langgraph-long-term-memory/.gitignore @@ -218,3 +218,4 @@ sketch !.claude/skills/lakebase-setup/ !.claude/skills/agent-memory/ !.claude/skills/migrate-from-model-serving/ +!.claude/skills/use-supervisor-api/ diff --git a/agent-langgraph-short-term-memory/.claude/skills/use-supervisor-api/SKILL.md b/agent-langgraph-short-term-memory/.claude/skills/use-supervisor-api/SKILL.md new file mode 100644 index 00000000..4866fd44 --- /dev/null +++ b/agent-langgraph-short-term-memory/.claude/skills/use-supervisor-api/SKILL.md @@ -0,0 +1,183 @@ +--- +name: use-supervisor-api +description: "Replace the client-side agent loop with Databricks Supervisor API (hosted tools). Use when: (1) User asks about Supervisor API, (2) User wants Databricks to run the agent loop server-side, (3) Connecting Genie spaces, UC functions, agent endpoints, or MCP servers as hosted tools." +--- + +# Use the Databricks Supervisor API + +The Supervisor API lets Databricks run the tool-selection and synthesis loop server-side. Instead of your agent managing tool calls and looping, you declare hosted tools and call `responses.create()` — Databricks handles the rest. + +## When to Use + +Use the Supervisor API when you want Databricks to manage the full agent loop for hosted tools: Genie spaces, UC functions, KA (Knowledge Assistant) agent endpoints, or MCP servers via UC connections. 
+
+**Limitations:**
+- Cannot mix hosted tools with client-side function tools in the same request
+- Inference parameters (e.g., `temperature`, `top_p`) are not supported when tools are passed
+
+## Step 1: Install `databricks-openai`
+
+Add to `pyproject.toml` if not already present:
+
+```toml
+[project]
+dependencies = [
+    ...
+    "databricks-openai>=0.14.0",
+    "databricks-sdk>=0.55.0",
+]
+```
+
+Then run `uv sync`.
+
+## Step 2: Declare Hosted Tools
+
+Define your tools as a list of dicts. Run `uv run discover-tools` to find available resources in your workspace.
+
+```python
+TOOLS = [
+    # Genie space — natural language queries over structured data
+    {
+        "type": "genie_space",
+        "genie_space": {
+            "description": "Query sales data using natural language",
+            "space_id": "<genie-space-id>",
+        },
+    },
+    # UC function — SQL or Python UDF
+    {
+        "type": "unity_catalog_function",
+        "unity_catalog_function": {
+            "name": "<catalog>.<schema>.<function>",
+            "description": "Executes a custom UC function",
+        },
+    },
+    # KA (Knowledge Assistant) endpoint — delegates to a Knowledge Assistant agent
+    # Note: agent_endpoint only supports KA endpoints, not arbitrary agent serving endpoints.
+    # KA endpoints use a specific ka_query protocol; regular LangGraph/OpenAI agents do not.
+    {
+        "type": "agent_endpoint",
+        "agent_endpoint": {
+            "name": "my-ka-agent",
+            "description": "A Knowledge Assistant agent",
+            "endpoint_name": "<ka-endpoint-name>",
+        },
+    },
+    # External MCP server via UC connection
+    {
+        "type": "external_mcp_server",
+        "external_mcp_server": {
+            "description": "An external MCP server",
+            "connection_name": "<uc-connection-name>",
+        },
+    },
+]
+```
+
+## Step 3: Update `agent_server/agent.py`
+
+Replace your existing invoke/stream handlers with the Supervisor API pattern. Remove any MCP client setup, LangGraph agents, or OpenAI Agents SDK runner code — the Supervisor API replaces the client-side loop entirely.
+
+`use_ai_gateway=True` automatically resolves the correct AI Gateway endpoint for the workspace.
+ +When deployed on Databricks Apps, the platform forwards the authenticated user's token via `x-forwarded-access-token`. Pass this to the Supervisor API so tool calls (e.g., Genie queries) run on behalf of the user rather than the app's service principal. + +```python +import mlflow +from databricks.sdk import WorkspaceClient +from databricks.sdk.config import Config +from databricks_openai import DatabricksOpenAI +from mlflow.genai.agent_server import invoke, stream +from mlflow.types.responses import ( + ResponsesAgentRequest, + ResponsesAgentResponse, +) + +mlflow.openai.autolog() + +MODEL = "databricks-claude-sonnet-4-5" +TOOLS = [...] # From Step 2 + +# Resolve and cache the AI Gateway URL once at module load +_wc = WorkspaceClient() +_client = DatabricksOpenAI(workspace_client=_wc, use_ai_gateway=True) +_ai_gateway_base_url = str(_client.base_url) + + +def _get_client(obo_token: str | None = None) -> DatabricksOpenAI: + """Return a client using the OBO token if provided, else service principal.""" + if obo_token: + obo_wc = WorkspaceClient( + config=Config(host=_wc.config.host, token=obo_token) + ) + return DatabricksOpenAI(workspace_client=obo_wc, base_url=_ai_gateway_base_url) + return _client + + +def _obo_token(request: ResponsesAgentRequest) -> str | None: + return (request.custom_inputs or {}).get("x-forwarded-access-token") + + +@invoke() +def invoke_handler(request: ResponsesAgentRequest) -> ResponsesAgentResponse: + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + response = _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=False, + ) + return ResponsesAgentResponse(output=[item.model_dump() for item in response.output]) + + +@stream() +def stream_handler(request: ResponsesAgentRequest): + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + return 
_get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=True, + ) +``` + +> **OBO note:** The `x-forwarded-access-token` is injected into `custom_inputs` by the app server middleware. No changes are needed to the client — the token arrives automatically when users call your deployed app. + +## Step 4: Grant Permissions in `databricks.yml` + +For each hosted tool, grant the corresponding resource access. See the **add-tools** skill for complete YAML examples. + +| Tool type | Resource to grant | +|-----------|-------------------| +| `genie_space` | `genie_space` with `CAN_RUN` | +| `unity_catalog_function` | `uc_securable` (FUNCTION) with `EXECUTE` | +| `agent_endpoint` | `serving_endpoint` with `CAN_QUERY` (KA endpoints only) | +| `external_mcp_server` | `uc_securable` (CONNECTION) with `USE_CONNECTION` | + +Also grant `CAN_QUERY` on the `MODEL` serving endpoint: + +```yaml +- name: 'model-endpoint' + serving_endpoint: + name: 'databricks-claude-sonnet-4-5' + permission: 'CAN_QUERY' +``` + +## Step 5: Test and Deploy + +```bash +uv run start-app # Test locally +databricks bundle deploy && databricks bundle run agent_langgraph_short_term_memory # Deploy +``` + +## Troubleshooting + +**"Please ensure AI Gateway V2 is enabled"** — AI Gateway must be enabled for the workspace. Contact your Databricks account team. + +**"Cannot mix hosted and client-side tools"** — Remove any `function`-type tools (Python callables) from `TOOLS`. All tools must be hosted types (`genie_space`, `unity_catalog_function`, `agent_endpoint`, `external_mcp_server`). + +**"Parameter not supported when tools are provided"** — Remove `temperature`, `top_p`, or other inference parameters from the `responses.create()` call. 
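The "Cannot mix hosted and client-side tools" error above can also be caught before the request leaves your process. A minimal pre-flight check (a hypothetical helper, not part of the Supervisor API or `databricks-openai`):

```python
# Hypothetical pre-flight check, not part of the Supervisor API: reject
# client-side "function" tools before calling responses.create().
HOSTED_TYPES = {
    "genie_space",
    "unity_catalog_function",
    "agent_endpoint",
    "external_mcp_server",
}


def assert_hosted_only(tools: list[dict]) -> list[dict]:
    """Raise ValueError if any tool is not a hosted type."""
    bad = [t.get("type") for t in tools if t.get("type") not in HOSTED_TYPES]
    if bad:
        raise ValueError(f"Hosted tools cannot be mixed with client-side tools: {bad}")
    return tools
```

Calling this on `TOOLS` at module load turns a server-side rejection into an immediate, descriptive local error.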
diff --git a/agent-langgraph-short-term-memory/.gitignore b/agent-langgraph-short-term-memory/.gitignore index 1c494c4c..a698549a 100644 --- a/agent-langgraph-short-term-memory/.gitignore +++ b/agent-langgraph-short-term-memory/.gitignore @@ -217,4 +217,5 @@ sketch !.claude/skills/modify-agent/ !.claude/skills/lakebase-setup/ !.claude/skills/agent-memory/ -!.claude/skills/migrate-from-model-serving/ \ No newline at end of file +!.claude/skills/migrate-from-model-serving/ +!.claude/skills/use-supervisor-api/ \ No newline at end of file diff --git a/agent-langgraph/.claude/skills/use-supervisor-api/SKILL.md b/agent-langgraph/.claude/skills/use-supervisor-api/SKILL.md new file mode 100644 index 00000000..a9ac7d45 --- /dev/null +++ b/agent-langgraph/.claude/skills/use-supervisor-api/SKILL.md @@ -0,0 +1,183 @@ +--- +name: use-supervisor-api +description: "Replace the client-side agent loop with Databricks Supervisor API (hosted tools). Use when: (1) User asks about Supervisor API, (2) User wants Databricks to run the agent loop server-side, (3) Connecting Genie spaces, UC functions, agent endpoints, or MCP servers as hosted tools." +--- + +# Use the Databricks Supervisor API + +The Supervisor API lets Databricks run the tool-selection and synthesis loop server-side. Instead of your agent managing tool calls and looping, you declare hosted tools and call `responses.create()` — Databricks handles the rest. + +## When to Use + +Use the Supervisor API when you want Databricks to manage the full agent loop for hosted tools: Genie spaces, UC functions, KA (Knowledge Assistant) agent endpoints, or MCP servers via UC connections. + +**Limitations:** +- Cannot mix hosted tools with client-side function tools in the same request +- Inference parameters (e.g., `temperature`, `top_p`) are not supported when tools are passed + +## Step 1: Install `databricks-openai` + +Add to `pyproject.toml` if not already present: + +```toml +[project] +dependencies = [ + ... 
+    "databricks-openai>=0.14.0",
+    "databricks-sdk>=0.55.0",
+]
+```
+
+Then run `uv sync`.
+
+## Step 2: Declare Hosted Tools
+
+Define your tools as a list of dicts. Run `uv run discover-tools` to find available resources in your workspace.
+
+```python
+TOOLS = [
+    # Genie space — natural language queries over structured data
+    {
+        "type": "genie_space",
+        "genie_space": {
+            "description": "Query sales data using natural language",
+            "space_id": "<genie-space-id>",
+        },
+    },
+    # UC function — SQL or Python UDF
+    {
+        "type": "unity_catalog_function",
+        "unity_catalog_function": {
+            "name": "<catalog>.<schema>.<function>",
+            "description": "Executes a custom UC function",
+        },
+    },
+    # KA (Knowledge Assistant) endpoint — delegates to a Knowledge Assistant agent
+    # Note: agent_endpoint only supports KA endpoints, not arbitrary agent serving endpoints.
+    # KA endpoints use a specific ka_query protocol; regular LangGraph/OpenAI agents do not.
+    {
+        "type": "agent_endpoint",
+        "agent_endpoint": {
+            "name": "my-ka-agent",
+            "description": "A Knowledge Assistant agent",
+            "endpoint_name": "<ka-endpoint-name>",
+        },
+    },
+    # External MCP server via UC connection
+    {
+        "type": "external_mcp_server",
+        "external_mcp_server": {
+            "description": "An external MCP server",
+            "connection_name": "<uc-connection-name>",
+        },
+    },
+]
+```
+
+## Step 3: Update `agent_server/agent.py`
+
+Replace your existing invoke/stream handlers with the Supervisor API pattern. Remove any MCP client setup, LangGraph agents, or OpenAI Agents SDK runner code — the Supervisor API replaces the client-side loop entirely.
+
+`use_ai_gateway=True` automatically resolves the correct AI Gateway endpoint for the workspace.
+
+When deployed on Databricks Apps, the platform forwards the authenticated user's token via `x-forwarded-access-token`. Pass this to the Supervisor API so tool calls (e.g., Genie queries) run on behalf of the user rather than the app's service principal.
+ +```python +import mlflow +from databricks.sdk import WorkspaceClient +from databricks.sdk.config import Config +from databricks_openai import DatabricksOpenAI +from mlflow.genai.agent_server import invoke, stream +from mlflow.types.responses import ( + ResponsesAgentRequest, + ResponsesAgentResponse, +) + +mlflow.openai.autolog() + +MODEL = "databricks-claude-sonnet-4-5" +TOOLS = [...] # From Step 2 + +# Resolve and cache the AI Gateway URL once at module load +_wc = WorkspaceClient() +_client = DatabricksOpenAI(workspace_client=_wc, use_ai_gateway=True) +_ai_gateway_base_url = str(_client.base_url) + + +def _get_client(obo_token: str | None = None) -> DatabricksOpenAI: + """Return a client using the OBO token if provided, else service principal.""" + if obo_token: + obo_wc = WorkspaceClient( + config=Config(host=_wc.config.host, token=obo_token) + ) + return DatabricksOpenAI(workspace_client=obo_wc, base_url=_ai_gateway_base_url) + return _client + + +def _obo_token(request: ResponsesAgentRequest) -> str | None: + return (request.custom_inputs or {}).get("x-forwarded-access-token") + + +@invoke() +def invoke_handler(request: ResponsesAgentRequest) -> ResponsesAgentResponse: + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + response = _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=False, + ) + return ResponsesAgentResponse(output=[item.model_dump() for item in response.output]) + + +@stream() +def stream_handler(request: ResponsesAgentRequest): + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + return _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=True, + ) +``` + +> **OBO note:** The `x-forwarded-access-token` is injected into `custom_inputs` by the app server middleware. 
No changes are needed to the client — the token arrives automatically when users call your deployed app. + +## Step 4: Grant Permissions in `databricks.yml` + +For each hosted tool, grant the corresponding resource access. See the **add-tools** skill for complete YAML examples. + +| Tool type | Resource to grant | +|-----------|-------------------| +| `genie_space` | `genie_space` with `CAN_RUN` | +| `unity_catalog_function` | `uc_securable` (FUNCTION) with `EXECUTE` | +| `agent_endpoint` | `serving_endpoint` with `CAN_QUERY` (KA endpoints only) | +| `external_mcp_server` | `uc_securable` (CONNECTION) with `USE_CONNECTION` | + +Also grant `CAN_QUERY` on the `MODEL` serving endpoint: + +```yaml +- name: 'model-endpoint' + serving_endpoint: + name: 'databricks-claude-sonnet-4-5' + permission: 'CAN_QUERY' +``` + +## Step 5: Test and Deploy + +```bash +uv run start-app # Test locally +databricks bundle deploy && databricks bundle run agent_langgraph # Deploy +``` + +## Troubleshooting + +**"Please ensure AI Gateway V2 is enabled"** — AI Gateway must be enabled for the workspace. Contact your Databricks account team. + +**"Cannot mix hosted and client-side tools"** — Remove any `function`-type tools (Python callables) from `TOOLS`. All tools must be hosted types (`genie_space`, `unity_catalog_function`, `agent_endpoint`, `external_mcp_server`). + +**"Parameter not supported when tools are provided"** — Remove `temperature`, `top_p`, or other inference parameters from the `responses.create()` call. 
diff --git a/agent-langgraph/.gitignore b/agent-langgraph/.gitignore index 8bc4e76b..16a2f70d 100644 --- a/agent-langgraph/.gitignore +++ b/agent-langgraph/.gitignore @@ -217,6 +217,7 @@ sketch !.claude/skills/lakebase-setup/ !.claude/skills/agent-memory/ !.claude/skills/migrate-from-model-serving/ +!.claude/skills/use-supervisor-api/ **/.env **/.env.local \ No newline at end of file diff --git a/agent-migration-from-model-serving/.claude/skills/use-supervisor-api/SKILL.md b/agent-migration-from-model-serving/.claude/skills/use-supervisor-api/SKILL.md new file mode 100644 index 00000000..adfb9042 --- /dev/null +++ b/agent-migration-from-model-serving/.claude/skills/use-supervisor-api/SKILL.md @@ -0,0 +1,183 @@ +--- +name: use-supervisor-api +description: "Replace the client-side agent loop with Databricks Supervisor API (hosted tools). Use when: (1) User asks about Supervisor API, (2) User wants Databricks to run the agent loop server-side, (3) Connecting Genie spaces, UC functions, agent endpoints, or MCP servers as hosted tools." +--- + +# Use the Databricks Supervisor API + +The Supervisor API lets Databricks run the tool-selection and synthesis loop server-side. Instead of your agent managing tool calls and looping, you declare hosted tools and call `responses.create()` — Databricks handles the rest. + +## When to Use + +Use the Supervisor API when you want Databricks to manage the full agent loop for hosted tools: Genie spaces, UC functions, KA (Knowledge Assistant) agent endpoints, or MCP servers via UC connections. + +**Limitations:** +- Cannot mix hosted tools with client-side function tools in the same request +- Inference parameters (e.g., `temperature`, `top_p`) are not supported when tools are passed + +## Step 1: Install `databricks-openai` + +Add to `pyproject.toml` if not already present: + +```toml +[project] +dependencies = [ + ... + "databricks-openai>=0.14.0", + "databricks-sdk>=0.55.0", +] +``` + +Then run `uv sync`. 
+
+## Step 2: Declare Hosted Tools
+
+Define your tools as a list of dicts. Run `uv run discover-tools` to find available resources in your workspace.
+
+```python
+TOOLS = [
+    # Genie space — natural language queries over structured data
+    {
+        "type": "genie_space",
+        "genie_space": {
+            "description": "Query sales data using natural language",
+            "space_id": "<genie-space-id>",
+        },
+    },
+    # UC function — SQL or Python UDF
+    {
+        "type": "unity_catalog_function",
+        "unity_catalog_function": {
+            "name": "<catalog>.<schema>.<function>",
+            "description": "Executes a custom UC function",
+        },
+    },
+    # KA (Knowledge Assistant) endpoint — delegates to a Knowledge Assistant agent
+    # Note: agent_endpoint only supports KA endpoints, not arbitrary agent serving endpoints.
+    # KA endpoints use a specific ka_query protocol; regular LangGraph/OpenAI agents do not.
+    {
+        "type": "agent_endpoint",
+        "agent_endpoint": {
+            "name": "my-ka-agent",
+            "description": "A Knowledge Assistant agent",
+            "endpoint_name": "<ka-endpoint-name>",
+        },
+    },
+    # External MCP server via UC connection
+    {
+        "type": "external_mcp_server",
+        "external_mcp_server": {
+            "description": "An external MCP server",
+            "connection_name": "<uc-connection-name>",
+        },
+    },
+]
+```
+
+## Step 3: Update `agent_server/agent.py`
+
+Replace your existing invoke/stream handlers with the Supervisor API pattern. Remove any MCP client setup, LangGraph agents, or OpenAI Agents SDK runner code — the Supervisor API replaces the client-side loop entirely.
+
+`use_ai_gateway=True` automatically resolves the correct AI Gateway endpoint for the workspace.
+
+When deployed on Databricks Apps, the platform forwards the authenticated user's token via `x-forwarded-access-token`. Pass this to the Supervisor API so tool calls (e.g., Genie queries) run on behalf of the user rather than the app's service principal.
+ +```python +import mlflow +from databricks.sdk import WorkspaceClient +from databricks.sdk.config import Config +from databricks_openai import DatabricksOpenAI +from mlflow.genai.agent_server import invoke, stream +from mlflow.types.responses import ( + ResponsesAgentRequest, + ResponsesAgentResponse, +) + +mlflow.openai.autolog() + +MODEL = "databricks-claude-sonnet-4-5" +TOOLS = [...] # From Step 2 + +# Resolve and cache the AI Gateway URL once at module load +_wc = WorkspaceClient() +_client = DatabricksOpenAI(workspace_client=_wc, use_ai_gateway=True) +_ai_gateway_base_url = str(_client.base_url) + + +def _get_client(obo_token: str | None = None) -> DatabricksOpenAI: + """Return a client using the OBO token if provided, else service principal.""" + if obo_token: + obo_wc = WorkspaceClient( + config=Config(host=_wc.config.host, token=obo_token) + ) + return DatabricksOpenAI(workspace_client=obo_wc, base_url=_ai_gateway_base_url) + return _client + + +def _obo_token(request: ResponsesAgentRequest) -> str | None: + return (request.custom_inputs or {}).get("x-forwarded-access-token") + + +@invoke() +def invoke_handler(request: ResponsesAgentRequest) -> ResponsesAgentResponse: + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + response = _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=False, + ) + return ResponsesAgentResponse(output=[item.model_dump() for item in response.output]) + + +@stream() +def stream_handler(request: ResponsesAgentRequest): + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + return _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=True, + ) +``` + +> **OBO note:** The `x-forwarded-access-token` is injected into `custom_inputs` by the app server middleware. 
No changes are needed to the client — the token arrives automatically when users call your deployed app. + +## Step 4: Grant Permissions in `databricks.yml` + +For each hosted tool, grant the corresponding resource access. See the **add-tools** skill for complete YAML examples. + +| Tool type | Resource to grant | +|-----------|-------------------| +| `genie_space` | `genie_space` with `CAN_RUN` | +| `unity_catalog_function` | `uc_securable` (FUNCTION) with `EXECUTE` | +| `agent_endpoint` | `serving_endpoint` with `CAN_QUERY` (KA endpoints only) | +| `external_mcp_server` | `uc_securable` (CONNECTION) with `USE_CONNECTION` | + +Also grant `CAN_QUERY` on the `MODEL` serving endpoint: + +```yaml +- name: 'model-endpoint' + serving_endpoint: + name: 'databricks-claude-sonnet-4-5' + permission: 'CAN_QUERY' +``` + +## Step 5: Test and Deploy + +```bash +uv run start-app # Test locally +databricks bundle deploy && databricks bundle run agent_migration # Deploy +``` + +## Troubleshooting + +**"Please ensure AI Gateway V2 is enabled"** — AI Gateway must be enabled for the workspace. Contact your Databricks account team. + +**"Cannot mix hosted and client-side tools"** — Remove any `function`-type tools (Python callables) from `TOOLS`. All tools must be hosted types (`genie_space`, `unity_catalog_function`, `agent_endpoint`, `external_mcp_server`). + +**"Parameter not supported when tools are provided"** — Remove `temperature`, `top_p`, or other inference parameters from the `responses.create()` call. 
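The "Cannot mix hosted and client-side tools" error above can also be caught before a request is sent. A minimal pre-flight sketch — `validate_hosted_tools` is a hypothetical helper written for this skill, not part of `databricks-openai`:

```python
# Hypothetical pre-flight check: fail fast if a client-side `function` tool
# is mixed into TOOLS, since the Supervisor API rejects such requests.
HOSTED_TOOL_TYPES = {
    "genie_space",
    "unity_catalog_function",
    "agent_endpoint",
    "external_mcp_server",
}


def validate_hosted_tools(tools: list[dict]) -> list[dict]:
    """Return `tools` unchanged, or raise if any entry is not a hosted type."""
    unsupported = [t.get("type") for t in tools if t.get("type") not in HOSTED_TOOL_TYPES]
    if unsupported:
        raise ValueError(
            f"Supervisor API requests cannot include client-side tool types: {unsupported}"
        )
    return tools
```

Calling `validate_hosted_tools(TOOLS)` once at module load surfaces a misconfigured tool list at startup instead of at request time.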
diff --git a/agent-migration-from-model-serving/.gitignore b/agent-migration-from-model-serving/.gitignore index 9bd156cf..d6c27b9b 100644 --- a/agent-migration-from-model-serving/.gitignore +++ b/agent-migration-from-model-serving/.gitignore @@ -219,6 +219,7 @@ sketch !.claude/skills/lakebase-setup/ !.claude/skills/agent-memory/ !.claude/skills/migrate-from-model-serving/ +!.claude/skills/use-supervisor-api/ **/.env **/.env.local \ No newline at end of file diff --git a/agent-non-conversational/.claude/skills/use-supervisor-api/SKILL.md b/agent-non-conversational/.claude/skills/use-supervisor-api/SKILL.md new file mode 100644 index 00000000..261b6b9f --- /dev/null +++ b/agent-non-conversational/.claude/skills/use-supervisor-api/SKILL.md @@ -0,0 +1,183 @@ +--- +name: use-supervisor-api +description: "Replace the client-side agent loop with Databricks Supervisor API (hosted tools). Use when: (1) User asks about Supervisor API, (2) User wants Databricks to run the agent loop server-side, (3) Connecting Genie spaces, UC functions, agent endpoints, or MCP servers as hosted tools." +--- + +# Use the Databricks Supervisor API + +The Supervisor API lets Databricks run the tool-selection and synthesis loop server-side. Instead of your agent managing tool calls and looping, you declare hosted tools and call `responses.create()` — Databricks handles the rest. + +## When to Use + +Use the Supervisor API when you want Databricks to manage the full agent loop for hosted tools: Genie spaces, UC functions, KA (Knowledge Assistant) agent endpoints, or MCP servers via UC connections. + +**Limitations:** +- Cannot mix hosted tools with client-side function tools in the same request +- Inference parameters (e.g., `temperature`, `top_p`) are not supported when tools are passed + +## Step 1: Install `databricks-openai` + +Add to `pyproject.toml` if not already present: + +```toml +[project] +dependencies = [ + ... 
+ "databricks-openai>=0.14.0", + "databricks-sdk>=0.55.0", +] +``` + +Then run `uv sync`. + +## Step 2: Declare Hosted Tools + +Define your tools as a list of dicts. Run `uv run discover-tools` to find available resources in your workspace. + +```python +TOOLS = [ + # Genie space — natural language queries over structured data + { + "type": "genie_space", + "genie_space": { + "description": "Query sales data using natural language", + "space_id": "", + }, + }, + # UC function — SQL or Python UDF + { + "type": "unity_catalog_function", + "unity_catalog_function": { + "name": "..", + "description": "Executes a custom UC function", + }, + }, + # KA (Knowledge Assistant) endpoint — delegates to a Knowledge Assistant agent + # Note: agent_endpoint only supports KA endpoints, not arbitrary agent serving endpoints. + # KA endpoints use a specific ka_query protocol; regular LangGraph/OpenAI agents do not. + { + "type": "agent_endpoint", + "agent_endpoint": { + "name": "my-ka-agent", + "description": "A Knowledge Assistant agent", + "endpoint_name": "", + }, + }, + # External MCP server via UC connection + { + "type": "external_mcp_server", + "external_mcp_server": { + "description": "An external MCP server", + "connection_name": "", + }, + }, +] +``` + +## Step 3: Update `agent_server/agent.py` + +Replace your existing invoke/stream handlers with the Supervisor API pattern. Remove any MCP client setup, LangGraph agents, or OpenAI Agents SDK runner code — the Supervisor API replaces the client-side loop entirely. + +`use_ai_gateway=True` automatically resolves the correct AI Gateway endpoint for the workspace. + +When deployed on Databricks Apps, the platform forwards the authenticated user's token via `x-forwarded-access-token`. Pass this to the Supervisor API so tool calls (e.g., Genie queries) run on behalf of the user rather than the app's service principal. 
+ +```python +import mlflow +from databricks.sdk import WorkspaceClient +from databricks.sdk.config import Config +from databricks_openai import DatabricksOpenAI +from mlflow.genai.agent_server import invoke, stream +from mlflow.types.responses import ( + ResponsesAgentRequest, + ResponsesAgentResponse, +) + +mlflow.openai.autolog() + +MODEL = "databricks-claude-sonnet-4-5" +TOOLS = [...] # From Step 2 + +# Resolve and cache the AI Gateway URL once at module load +_wc = WorkspaceClient() +_client = DatabricksOpenAI(workspace_client=_wc, use_ai_gateway=True) +_ai_gateway_base_url = str(_client.base_url) + + +def _get_client(obo_token: str | None = None) -> DatabricksOpenAI: + """Return a client using the OBO token if provided, else service principal.""" + if obo_token: + obo_wc = WorkspaceClient( + config=Config(host=_wc.config.host, token=obo_token) + ) + return DatabricksOpenAI(workspace_client=obo_wc, base_url=_ai_gateway_base_url) + return _client + + +def _obo_token(request: ResponsesAgentRequest) -> str | None: + return (request.custom_inputs or {}).get("x-forwarded-access-token") + + +@invoke() +def invoke_handler(request: ResponsesAgentRequest) -> ResponsesAgentResponse: + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + response = _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=False, + ) + return ResponsesAgentResponse(output=[item.model_dump() for item in response.output]) + + +@stream() +def stream_handler(request: ResponsesAgentRequest): + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + return _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=True, + ) +``` + +> **OBO note:** The `x-forwarded-access-token` is injected into `custom_inputs` by the app server middleware. 
No changes are needed to the client — the token arrives automatically when users call your deployed app. + +## Step 4: Grant Permissions in `databricks.yml` + +For each hosted tool, grant the corresponding resource access. See the **add-tools** skill for complete YAML examples. + +| Tool type | Resource to grant | +|-----------|-------------------| +| `genie_space` | `genie_space` with `CAN_RUN` | +| `unity_catalog_function` | `uc_securable` (FUNCTION) with `EXECUTE` | +| `agent_endpoint` | `serving_endpoint` with `CAN_QUERY` (KA endpoints only) | +| `external_mcp_server` | `uc_securable` (CONNECTION) with `USE_CONNECTION` | + +Also grant `CAN_QUERY` on the `MODEL` serving endpoint: + +```yaml +- name: 'model-endpoint' + serving_endpoint: + name: 'databricks-claude-sonnet-4-5' + permission: 'CAN_QUERY' +``` + +## Step 5: Test and Deploy + +```bash +uv run start-app # Test locally +databricks bundle deploy && databricks bundle run agent_non_conversational # Deploy +``` + +## Troubleshooting + +**"Please ensure AI Gateway V2 is enabled"** — AI Gateway must be enabled for the workspace. Contact your Databricks account team. + +**"Cannot mix hosted and client-side tools"** — Remove any `function`-type tools (Python callables) from `TOOLS`. All tools must be hosted types (`genie_space`, `unity_catalog_function`, `agent_endpoint`, `external_mcp_server`). + +**"Parameter not supported when tools are provided"** — Remove `temperature`, `top_p`, or other inference parameters from the `responses.create()` call. 
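The "Cannot mix hosted and client-side tools" error above can also be caught before a request is sent. A minimal pre-flight sketch — `validate_hosted_tools` is a hypothetical helper written for this skill, not part of `databricks-openai`:

```python
# Hypothetical pre-flight check: fail fast if a client-side `function` tool
# is mixed into TOOLS, since the Supervisor API rejects such requests.
HOSTED_TOOL_TYPES = {
    "genie_space",
    "unity_catalog_function",
    "agent_endpoint",
    "external_mcp_server",
}


def validate_hosted_tools(tools: list[dict]) -> list[dict]:
    """Return `tools` unchanged, or raise if any entry is not a hosted type."""
    unsupported = [t.get("type") for t in tools if t.get("type") not in HOSTED_TOOL_TYPES]
    if unsupported:
        raise ValueError(
            f"Supervisor API requests cannot include client-side tool types: {unsupported}"
        )
    return tools
```

Calling `validate_hosted_tools(TOOLS)` once at module load surfaces a misconfigured tool list at startup instead of at request time.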
diff --git a/agent-non-conversational/.gitignore b/agent-non-conversational/.gitignore index 8bc4e76b..16a2f70d 100644 --- a/agent-non-conversational/.gitignore +++ b/agent-non-conversational/.gitignore @@ -217,6 +217,7 @@ sketch !.claude/skills/lakebase-setup/ !.claude/skills/agent-memory/ !.claude/skills/migrate-from-model-serving/ +!.claude/skills/use-supervisor-api/ **/.env **/.env.local \ No newline at end of file diff --git a/agent-openai-agents-sdk-long-running-agent/.claude/skills/use-supervisor-api/SKILL.md b/agent-openai-agents-sdk-long-running-agent/.claude/skills/use-supervisor-api/SKILL.md new file mode 100644 index 00000000..b0a23738 --- /dev/null +++ b/agent-openai-agents-sdk-long-running-agent/.claude/skills/use-supervisor-api/SKILL.md @@ -0,0 +1,183 @@ +--- +name: use-supervisor-api +description: "Replace the client-side agent loop with Databricks Supervisor API (hosted tools). Use when: (1) User asks about Supervisor API, (2) User wants Databricks to run the agent loop server-side, (3) Connecting Genie spaces, UC functions, agent endpoints, or MCP servers as hosted tools." +--- + +# Use the Databricks Supervisor API + +The Supervisor API lets Databricks run the tool-selection and synthesis loop server-side. Instead of your agent managing tool calls and looping, you declare hosted tools and call `responses.create()` — Databricks handles the rest. + +## When to Use + +Use the Supervisor API when you want Databricks to manage the full agent loop for hosted tools: Genie spaces, UC functions, KA (Knowledge Assistant) agent endpoints, or MCP servers via UC connections. + +**Limitations:** +- Cannot mix hosted tools with client-side function tools in the same request +- Inference parameters (e.g., `temperature`, `top_p`) are not supported when tools are passed + +## Step 1: Install `databricks-openai` + +Add to `pyproject.toml` if not already present: + +```toml +[project] +dependencies = [ + ... 
+ "databricks-openai>=0.14.0", + "databricks-sdk>=0.55.0", +] +``` + +Then run `uv sync`. + +## Step 2: Declare Hosted Tools + +Define your tools as a list of dicts. Run `uv run discover-tools` to find available resources in your workspace. + +```python +TOOLS = [ + # Genie space — natural language queries over structured data + { + "type": "genie_space", + "genie_space": { + "description": "Query sales data using natural language", + "space_id": "", + }, + }, + # UC function — SQL or Python UDF + { + "type": "unity_catalog_function", + "unity_catalog_function": { + "name": "..", + "description": "Executes a custom UC function", + }, + }, + # KA (Knowledge Assistant) endpoint — delegates to a Knowledge Assistant agent + # Note: agent_endpoint only supports KA endpoints, not arbitrary agent serving endpoints. + # KA endpoints use a specific ka_query protocol; regular LangGraph/OpenAI agents do not. + { + "type": "agent_endpoint", + "agent_endpoint": { + "name": "my-ka-agent", + "description": "A Knowledge Assistant agent", + "endpoint_name": "", + }, + }, + # External MCP server via UC connection + { + "type": "external_mcp_server", + "external_mcp_server": { + "description": "An external MCP server", + "connection_name": "", + }, + }, +] +``` + +## Step 3: Update `agent_server/agent.py` + +Replace your existing invoke/stream handlers with the Supervisor API pattern. Remove any MCP client setup, LangGraph agents, or OpenAI Agents SDK runner code — the Supervisor API replaces the client-side loop entirely. + +`use_ai_gateway=True` automatically resolves the correct AI Gateway endpoint for the workspace. + +When deployed on Databricks Apps, the platform forwards the authenticated user's token via `x-forwarded-access-token`. Pass this to the Supervisor API so tool calls (e.g., Genie queries) run on behalf of the user rather than the app's service principal. 
+ +```python +import mlflow +from databricks.sdk import WorkspaceClient +from databricks.sdk.config import Config +from databricks_openai import DatabricksOpenAI +from mlflow.genai.agent_server import invoke, stream +from mlflow.types.responses import ( + ResponsesAgentRequest, + ResponsesAgentResponse, +) + +mlflow.openai.autolog() + +MODEL = "databricks-claude-sonnet-4-5" +TOOLS = [...] # From Step 2 + +# Resolve and cache the AI Gateway URL once at module load +_wc = WorkspaceClient() +_client = DatabricksOpenAI(workspace_client=_wc, use_ai_gateway=True) +_ai_gateway_base_url = str(_client.base_url) + + +def _get_client(obo_token: str | None = None) -> DatabricksOpenAI: + """Return a client using the OBO token if provided, else service principal.""" + if obo_token: + obo_wc = WorkspaceClient( + config=Config(host=_wc.config.host, token=obo_token) + ) + return DatabricksOpenAI(workspace_client=obo_wc, base_url=_ai_gateway_base_url) + return _client + + +def _obo_token(request: ResponsesAgentRequest) -> str | None: + return (request.custom_inputs or {}).get("x-forwarded-access-token") + + +@invoke() +def invoke_handler(request: ResponsesAgentRequest) -> ResponsesAgentResponse: + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + response = _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=False, + ) + return ResponsesAgentResponse(output=[item.model_dump() for item in response.output]) + + +@stream() +def stream_handler(request: ResponsesAgentRequest): + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + return _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=True, + ) +``` + +> **OBO note:** The `x-forwarded-access-token` is injected into `custom_inputs` by the app server middleware. 
No changes are needed to the client — the token arrives automatically when users call your deployed app. + +## Step 4: Grant Permissions in `databricks.yml` + +For each hosted tool, grant the corresponding resource access. See the **add-tools** skill for complete YAML examples. + +| Tool type | Resource to grant | +|-----------|-------------------| +| `genie_space` | `genie_space` with `CAN_RUN` | +| `unity_catalog_function` | `uc_securable` (FUNCTION) with `EXECUTE` | +| `agent_endpoint` | `serving_endpoint` with `CAN_QUERY` (KA endpoints only) | +| `external_mcp_server` | `uc_securable` (CONNECTION) with `USE_CONNECTION` | + +Also grant `CAN_QUERY` on the `MODEL` serving endpoint: + +```yaml +- name: 'model-endpoint' + serving_endpoint: + name: 'databricks-claude-sonnet-4-5' + permission: 'CAN_QUERY' +``` + +## Step 5: Test and Deploy + +```bash +uv run start-app # Test locally +databricks bundle deploy && databricks bundle run agent_openai_agents_sdk_long_running_agent # Deploy +``` + +## Troubleshooting + +**"Please ensure AI Gateway V2 is enabled"** — AI Gateway must be enabled for the workspace. Contact your Databricks account team. + +**"Cannot mix hosted and client-side tools"** — Remove any `function`-type tools (Python callables) from `TOOLS`. All tools must be hosted types (`genie_space`, `unity_catalog_function`, `agent_endpoint`, `external_mcp_server`). + +**"Parameter not supported when tools are provided"** — Remove `temperature`, `top_p`, or other inference parameters from the `responses.create()` call. 
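The "Cannot mix hosted and client-side tools" error above can also be caught before a request is sent. A minimal pre-flight sketch — `validate_hosted_tools` is a hypothetical helper written for this skill, not part of `databricks-openai`:

```python
# Hypothetical pre-flight check: fail fast if a client-side `function` tool
# is mixed into TOOLS, since the Supervisor API rejects such requests.
HOSTED_TOOL_TYPES = {
    "genie_space",
    "unity_catalog_function",
    "agent_endpoint",
    "external_mcp_server",
}


def validate_hosted_tools(tools: list[dict]) -> list[dict]:
    """Return `tools` unchanged, or raise if any entry is not a hosted type."""
    unsupported = [t.get("type") for t in tools if t.get("type") not in HOSTED_TOOL_TYPES]
    if unsupported:
        raise ValueError(
            f"Supervisor API requests cannot include client-side tool types: {unsupported}"
        )
    return tools
```

Calling `validate_hosted_tools(TOOLS)` once at module load surfaces a misconfigured tool list at startup instead of at request time.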
diff --git a/agent-openai-agents-sdk-long-running-agent/.gitignore b/agent-openai-agents-sdk-long-running-agent/.gitignore index 9f7d0756..cc094a88 100644 --- a/agent-openai-agents-sdk-long-running-agent/.gitignore +++ b/agent-openai-agents-sdk-long-running-agent/.gitignore @@ -215,6 +215,7 @@ sketch !.claude/skills/run-locally/ !.claude/skills/modify-agent/ !.claude/skills/migrate-from-model-serving/ +!.claude/skills/use-supervisor-api/ !.claude/skills/agent-memory/ !.claude/skills/lakebase-setup/ diff --git a/agent-openai-agents-sdk-multiagent/.claude/skills/use-supervisor-api/SKILL.md b/agent-openai-agents-sdk-multiagent/.claude/skills/use-supervisor-api/SKILL.md new file mode 100644 index 00000000..96bc94af --- /dev/null +++ b/agent-openai-agents-sdk-multiagent/.claude/skills/use-supervisor-api/SKILL.md @@ -0,0 +1,183 @@ +--- +name: use-supervisor-api +description: "Replace the client-side agent loop with Databricks Supervisor API (hosted tools). Use when: (1) User asks about Supervisor API, (2) User wants Databricks to run the agent loop server-side, (3) Connecting Genie spaces, UC functions, agent endpoints, or MCP servers as hosted tools." +--- + +# Use the Databricks Supervisor API + +The Supervisor API lets Databricks run the tool-selection and synthesis loop server-side. Instead of your agent managing tool calls and looping, you declare hosted tools and call `responses.create()` — Databricks handles the rest. + +## When to Use + +Use the Supervisor API when you want Databricks to manage the full agent loop for hosted tools: Genie spaces, UC functions, KA (Knowledge Assistant) agent endpoints, or MCP servers via UC connections. + +**Limitations:** +- Cannot mix hosted tools with client-side function tools in the same request +- Inference parameters (e.g., `temperature`, `top_p`) are not supported when tools are passed + +## Step 1: Install `databricks-openai` + +Add to `pyproject.toml` if not already present: + +```toml +[project] +dependencies = [ + ... 
+ "databricks-openai>=0.14.0", + "databricks-sdk>=0.55.0", +] +``` + +Then run `uv sync`. + +## Step 2: Declare Hosted Tools + +Define your tools as a list of dicts. Run `uv run discover-tools` to find available resources in your workspace. + +```python +TOOLS = [ + # Genie space — natural language queries over structured data + { + "type": "genie_space", + "genie_space": { + "description": "Query sales data using natural language", + "space_id": "", + }, + }, + # UC function — SQL or Python UDF + { + "type": "unity_catalog_function", + "unity_catalog_function": { + "name": "..", + "description": "Executes a custom UC function", + }, + }, + # KA (Knowledge Assistant) endpoint — delegates to a Knowledge Assistant agent + # Note: agent_endpoint only supports KA endpoints, not arbitrary agent serving endpoints. + # KA endpoints use a specific ka_query protocol; regular LangGraph/OpenAI agents do not. + { + "type": "agent_endpoint", + "agent_endpoint": { + "name": "my-ka-agent", + "description": "A Knowledge Assistant agent", + "endpoint_name": "", + }, + }, + # External MCP server via UC connection + { + "type": "external_mcp_server", + "external_mcp_server": { + "description": "An external MCP server", + "connection_name": "", + }, + }, +] +``` + +## Step 3: Update `agent_server/agent.py` + +Replace your existing invoke/stream handlers with the Supervisor API pattern. Remove any MCP client setup, LangGraph agents, or OpenAI Agents SDK runner code — the Supervisor API replaces the client-side loop entirely. + +`use_ai_gateway=True` automatically resolves the correct AI Gateway endpoint for the workspace. + +When deployed on Databricks Apps, the platform forwards the authenticated user's token via `x-forwarded-access-token`. Pass this to the Supervisor API so tool calls (e.g., Genie queries) run on behalf of the user rather than the app's service principal. 
+ +```python +import mlflow +from databricks.sdk import WorkspaceClient +from databricks.sdk.config import Config +from databricks_openai import DatabricksOpenAI +from mlflow.genai.agent_server import invoke, stream +from mlflow.types.responses import ( + ResponsesAgentRequest, + ResponsesAgentResponse, +) + +mlflow.openai.autolog() + +MODEL = "databricks-claude-sonnet-4-5" +TOOLS = [...] # From Step 2 + +# Resolve and cache the AI Gateway URL once at module load +_wc = WorkspaceClient() +_client = DatabricksOpenAI(workspace_client=_wc, use_ai_gateway=True) +_ai_gateway_base_url = str(_client.base_url) + + +def _get_client(obo_token: str | None = None) -> DatabricksOpenAI: + """Return a client using the OBO token if provided, else service principal.""" + if obo_token: + obo_wc = WorkspaceClient( + config=Config(host=_wc.config.host, token=obo_token) + ) + return DatabricksOpenAI(workspace_client=obo_wc, base_url=_ai_gateway_base_url) + return _client + + +def _obo_token(request: ResponsesAgentRequest) -> str | None: + return (request.custom_inputs or {}).get("x-forwarded-access-token") + + +@invoke() +def invoke_handler(request: ResponsesAgentRequest) -> ResponsesAgentResponse: + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + response = _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=False, + ) + return ResponsesAgentResponse(output=[item.model_dump() for item in response.output]) + + +@stream() +def stream_handler(request: ResponsesAgentRequest): + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + return _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=True, + ) +``` + +> **OBO note:** The `x-forwarded-access-token` is injected into `custom_inputs` by the app server middleware. 
No changes are needed to the client — the token arrives automatically when users call your deployed app. + +## Step 4: Grant Permissions in `databricks.yml` + +For each hosted tool, grant the corresponding resource access. See the **add-tools** skill for complete YAML examples. + +| Tool type | Resource to grant | +|-----------|-------------------| +| `genie_space` | `genie_space` with `CAN_RUN` | +| `unity_catalog_function` | `uc_securable` (FUNCTION) with `EXECUTE` | +| `agent_endpoint` | `serving_endpoint` with `CAN_QUERY` (KA endpoints only) | +| `external_mcp_server` | `uc_securable` (CONNECTION) with `USE_CONNECTION` | + +Also grant `CAN_QUERY` on the `MODEL` serving endpoint: + +```yaml +- name: 'model-endpoint' + serving_endpoint: + name: 'databricks-claude-sonnet-4-5' + permission: 'CAN_QUERY' +``` + +## Step 5: Test and Deploy + +```bash +uv run start-app # Test locally +databricks bundle deploy && databricks bundle run agent_openai_agents_sdk_multiagent # Deploy +``` + +## Troubleshooting + +**"Please ensure AI Gateway V2 is enabled"** — AI Gateway must be enabled for the workspace. Contact your Databricks account team. + +**"Cannot mix hosted and client-side tools"** — Remove any `function`-type tools (Python callables) from `TOOLS`. All tools must be hosted types (`genie_space`, `unity_catalog_function`, `agent_endpoint`, `external_mcp_server`). + +**"Parameter not supported when tools are provided"** — Remove `temperature`, `top_p`, or other inference parameters from the `responses.create()` call. 
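The "Cannot mix hosted and client-side tools" error above can also be caught before a request is sent. A minimal pre-flight sketch — `validate_hosted_tools` is a hypothetical helper written for this skill, not part of `databricks-openai`:

```python
# Hypothetical pre-flight check: fail fast if a client-side `function` tool
# is mixed into TOOLS, since the Supervisor API rejects such requests.
HOSTED_TOOL_TYPES = {
    "genie_space",
    "unity_catalog_function",
    "agent_endpoint",
    "external_mcp_server",
}


def validate_hosted_tools(tools: list[dict]) -> list[dict]:
    """Return `tools` unchanged, or raise if any entry is not a hosted type."""
    unsupported = [t.get("type") for t in tools if t.get("type") not in HOSTED_TOOL_TYPES]
    if unsupported:
        raise ValueError(
            f"Supervisor API requests cannot include client-side tool types: {unsupported}"
        )
    return tools
```

Calling `validate_hosted_tools(TOOLS)` once at module load surfaces a misconfigured tool list at startup instead of at request time.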
diff --git a/agent-openai-agents-sdk-multiagent/.gitignore b/agent-openai-agents-sdk-multiagent/.gitignore index 1607735d..65539cd8 100644 --- a/agent-openai-agents-sdk-multiagent/.gitignore +++ b/agent-openai-agents-sdk-multiagent/.gitignore @@ -215,6 +215,7 @@ sketch !.claude/skills/run-locally/ !.claude/skills/modify-agent/ !.claude/skills/migrate-from-model-serving/ +!.claude/skills/use-supervisor-api/ **/.env **/.env.local diff --git a/agent-openai-agents-sdk/.claude/skills/use-supervisor-api/SKILL.md b/agent-openai-agents-sdk/.claude/skills/use-supervisor-api/SKILL.md new file mode 100644 index 00000000..f36f72ce --- /dev/null +++ b/agent-openai-agents-sdk/.claude/skills/use-supervisor-api/SKILL.md @@ -0,0 +1,183 @@ +--- +name: use-supervisor-api +description: "Replace the client-side agent loop with Databricks Supervisor API (hosted tools). Use when: (1) User asks about Supervisor API, (2) User wants Databricks to run the agent loop server-side, (3) Connecting Genie spaces, UC functions, agent endpoints, or MCP servers as hosted tools." +--- + +# Use the Databricks Supervisor API + +The Supervisor API lets Databricks run the tool-selection and synthesis loop server-side. Instead of your agent managing tool calls and looping, you declare hosted tools and call `responses.create()` — Databricks handles the rest. + +## When to Use + +Use the Supervisor API when you want Databricks to manage the full agent loop for hosted tools: Genie spaces, UC functions, KA (Knowledge Assistant) agent endpoints, or MCP servers via UC connections. + +**Limitations:** +- Cannot mix hosted tools with client-side function tools in the same request +- Inference parameters (e.g., `temperature`, `top_p`) are not supported when tools are passed + +## Step 1: Install `databricks-openai` + +Add to `pyproject.toml` if not already present: + +```toml +[project] +dependencies = [ + ... + "databricks-openai>=0.14.0", + "databricks-sdk>=0.55.0", +] +``` + +Then run `uv sync`. 
+ +## Step 2: Declare Hosted Tools + +Define your tools as a list of dicts. Run `uv run discover-tools` to find available resources in your workspace. + +```python +TOOLS = [ + # Genie space — natural language queries over structured data + { + "type": "genie_space", + "genie_space": { + "description": "Query sales data using natural language", + "space_id": "", + }, + }, + # UC function — SQL or Python UDF + { + "type": "unity_catalog_function", + "unity_catalog_function": { + "name": "..", + "description": "Executes a custom UC function", + }, + }, + # KA (Knowledge Assistant) endpoint — delegates to a Knowledge Assistant agent + # Note: agent_endpoint only supports KA endpoints, not arbitrary agent serving endpoints. + # KA endpoints use a specific ka_query protocol; regular LangGraph/OpenAI agents do not. + { + "type": "agent_endpoint", + "agent_endpoint": { + "name": "my-ka-agent", + "description": "A Knowledge Assistant agent", + "endpoint_name": "", + }, + }, + # External MCP server via UC connection + { + "type": "external_mcp_server", + "external_mcp_server": { + "description": "An external MCP server", + "connection_name": "", + }, + }, +] +``` + +## Step 3: Update `agent_server/agent.py` + +Replace your existing invoke/stream handlers with the Supervisor API pattern. Remove any MCP client setup, LangGraph agents, or OpenAI Agents SDK runner code — the Supervisor API replaces the client-side loop entirely. + +`use_ai_gateway=True` automatically resolves the correct AI Gateway endpoint for the workspace. + +When deployed on Databricks Apps, the platform forwards the authenticated user's token via `x-forwarded-access-token`. Pass this to the Supervisor API so tool calls (e.g., Genie queries) run on behalf of the user rather than the app's service principal. 
+ +```python +import mlflow +from databricks.sdk import WorkspaceClient +from databricks.sdk.config import Config +from databricks_openai import DatabricksOpenAI +from mlflow.genai.agent_server import invoke, stream +from mlflow.types.responses import ( + ResponsesAgentRequest, + ResponsesAgentResponse, +) + +mlflow.openai.autolog() + +MODEL = "databricks-claude-sonnet-4-5" +TOOLS = [...] # From Step 2 + +# Resolve and cache the AI Gateway URL once at module load +_wc = WorkspaceClient() +_client = DatabricksOpenAI(workspace_client=_wc, use_ai_gateway=True) +_ai_gateway_base_url = str(_client.base_url) + + +def _get_client(obo_token: str | None = None) -> DatabricksOpenAI: + """Return a client using the OBO token if provided, else service principal.""" + if obo_token: + obo_wc = WorkspaceClient( + config=Config(host=_wc.config.host, token=obo_token) + ) + return DatabricksOpenAI(workspace_client=obo_wc, base_url=_ai_gateway_base_url) + return _client + + +def _obo_token(request: ResponsesAgentRequest) -> str | None: + return (request.custom_inputs or {}).get("x-forwarded-access-token") + + +@invoke() +def invoke_handler(request: ResponsesAgentRequest) -> ResponsesAgentResponse: + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + response = _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=False, + ) + return ResponsesAgentResponse(output=[item.model_dump() for item in response.output]) + + +@stream() +def stream_handler(request: ResponsesAgentRequest): + mlflow.update_current_trace( + metadata={"mlflow.trace.session": request.context.conversation_id} + ) + return _get_client(_obo_token(request)).responses.create( + model=MODEL, + input=[i.model_dump() for i in request.input], + tools=TOOLS, + stream=True, + ) +``` + +> **OBO note:** The `x-forwarded-access-token` is injected into `custom_inputs` by the app server middleware. 
No changes are needed to the client — the token arrives automatically when users call your deployed app. + +## Step 4: Grant Permissions in `databricks.yml` + +For each hosted tool, grant the corresponding resource access. See the **add-tools** skill for complete YAML examples. + +| Tool type | Resource to grant | +|-----------|-------------------| +| `genie_space` | `genie_space` with `CAN_RUN` | +| `unity_catalog_function` | `uc_securable` (FUNCTION) with `EXECUTE` | +| `agent_endpoint` | `serving_endpoint` with `CAN_QUERY` (KA endpoints only) | +| `external_mcp_server` | `uc_securable` (CONNECTION) with `USE_CONNECTION` | + +Also grant `CAN_QUERY` on the `MODEL` serving endpoint: + +```yaml +- name: 'model-endpoint' + serving_endpoint: + name: 'databricks-claude-sonnet-4-5' + permission: 'CAN_QUERY' +``` + +## Step 5: Test and Deploy + +```bash +uv run start-app # Test locally +databricks bundle deploy && databricks bundle run agent_openai_agents_sdk # Deploy +``` + +## Troubleshooting + +**"Please ensure AI Gateway V2 is enabled"** — AI Gateway must be enabled for the workspace. Contact your Databricks account team. + +**"Cannot mix hosted and client-side tools"** — Remove any `function`-type tools (Python callables) from `TOOLS`. All tools must be hosted types (`genie_space`, `unity_catalog_function`, `agent_endpoint`, `external_mcp_server`). + +**"Parameter not supported when tools are provided"** — Remove `temperature`, `top_p`, or other inference parameters from the `responses.create()` call. 
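The "Cannot mix hosted and client-side tools" error above can also be caught before a request is sent. A minimal pre-flight sketch — `validate_hosted_tools` is a hypothetical helper written for this skill, not part of `databricks-openai`:

```python
# Hypothetical pre-flight check: fail fast if a client-side `function` tool
# is mixed into TOOLS, since the Supervisor API rejects such requests.
HOSTED_TOOL_TYPES = {
    "genie_space",
    "unity_catalog_function",
    "agent_endpoint",
    "external_mcp_server",
}


def validate_hosted_tools(tools: list[dict]) -> list[dict]:
    """Return `tools` unchanged, or raise if any entry is not a hosted type."""
    unsupported = [t.get("type") for t in tools if t.get("type") not in HOSTED_TOOL_TYPES]
    if unsupported:
        raise ValueError(
            f"Supervisor API requests cannot include client-side tool types: {unsupported}"
        )
    return tools
```

Calling `validate_hosted_tools(TOOLS)` once at module load surfaces a misconfigured tool list at startup instead of at request time.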
diff --git a/agent-openai-agents-sdk/.gitignore b/agent-openai-agents-sdk/.gitignore index 1607735d..65539cd8 100644 --- a/agent-openai-agents-sdk/.gitignore +++ b/agent-openai-agents-sdk/.gitignore @@ -215,6 +215,7 @@ sketch !.claude/skills/run-locally/ !.claude/skills/modify-agent/ !.claude/skills/migrate-from-model-serving/ +!.claude/skills/use-supervisor-api/ **/.env **/.env.local diff --git a/agent-openai-agents-sdk/e2e-chatbot-app-next/package-lock.json b/agent-openai-agents-sdk/e2e-chatbot-app-next/package-lock.json new file mode 100644 index 00000000..5c93f53a --- /dev/null +++ b/agent-openai-agents-sdk/e2e-chatbot-app-next/package-lock.json @@ -0,0 +1,16 @@ +{ + "name": "e2e-chatbot-app-next", + "lockfileVersion": 3, + "requires": true, + "packages": { + "client": { + "extraneous": true + }, + "packages/core": { + "extraneous": true + }, + "server": { + "extraneous": true + } + } +}