Tavily web search and extract for AppKit apps

## Summary

We propose adding a **Tavily plugin** to AppKit, giving Databricks Apps first-class access to real-time web data: search, content extraction, and (later) crawling and deep research. [Tavily](https://tavily.com) is a web access API built specifically for LLMs and agents — results come back as clean, relevance-scored, LLM-ready content rather than raw HTML.

We're the Tavily team, and **we'd like to contribute this plugin ourselves** — this issue is to align on scope and design before we open the PR.

## Motivation

AppKit apps are strong on Lakehouse-native data (analytics, Genie, vector search) but have no built-in way to reach *outside* the workspace. Common patterns this unlocks:

- **Grounded AI agents** — the agents plugin resolves `plugin:NAME` tool providers; a Tavily plugin exposing a `toolkit()` would let any AppKit agent do web search/extraction with one line of frontmatter (`tools: [plugin:tavily]`), complementing Genie and vector-search for questions that need current, external information.
- **RAG enrichment** — combine vector-search results over internal docs with fresh web context in serving/agents flows.
- **Data enrichment apps** — enrich entities from Lakehouse tables (companies, products, tickers) with live web data.

## Proposed design

A core plugin following the existing conventions (`packages/appkit/src/plugins/tavily/`):

**Manifest** — one required `secret` resource (permission: `READ`) holding the Tavily API key, surfaced via a `TAVILY_API_KEY` env field, so `databricks apps init` and `appkit plugin sync` wire it up like any other resource. No OBO semantics — the key is app-level, like a service principal credential.

**Server API** — typed methods plus injected routes:

```typescript
const app = await createApp({ plugins: [new TavilyPlugin({ /* defaults */ })] });

await app.tavily.search("latest EU AI Act enforcement actions", {
  maxResults: 5,
  timeRange: "month",
});
await app.tavily.extract(["https://example.com/report"], { format: "markdown" });
```

- `POST /api/tavily/search` and `POST /api/tavily/extract`, validated with Zod like other plugins.
- All outbound calls go through `this.execute()`, so caching (search results are very cacheable), retry, timeout, and OpenTelemetry tracing come for free from the interceptor chain.
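To illustrate why search results cache well, here is a minimal sketch of a normalized cache key and a TTL cache. It is not the AppKit interceptor chain: `SearchOptions`, `searchCacheKey`, and `TtlCache` are hypothetical names made up for this example, and the real caching would come from `this.execute()`.

```typescript
// Hypothetical sketch: none of these names are AppKit or Tavily APIs.
interface SearchOptions {
  maxResults?: number;
  timeRange?: "day" | "week" | "month" | "year";
}

// Build a stable key: collapse whitespace in the query and sort option keys,
// dropping undefined values so {} and { maxResults: undefined } match.
function searchCacheKey(query: string, options: SearchOptions = {}): string {
  const normalizedQuery = query.trim().replace(/\s+/g, " ");
  const sortedOptions = Object.entries(options)
    .filter(([, value]) => value !== undefined)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return JSON.stringify([normalizedQuery, sortedOptions]);
}

// In-memory cache whose entries expire ttlMs after they are written.
// The clock is injectable so expiry can be tested deterministically.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // evict lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Two requests that differ only in whitespace or option order map to the same key, so the second one can be served without an outbound call until the TTL elapses.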

**Agents integration** — implement the `ToolProvider` contract (`toolkit()`), so agents get `tavily_search` / `tavily_extract` tools, subject to the existing tool-approval gate.
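A rough sketch of what the tool surface could look like. The `ToolDefinition` and `ToolProvider` shapes here are assumptions written for illustration; the actual contract in the agents plugin may differ, and `TavilyMethods` is a hypothetical stand-in for the plugin's typed `search` / `extract` methods.

```typescript
// ASSUMPTION: illustrative shapes only; AppKit's real ToolProvider contract may differ.
interface ToolDefinition {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema for the tool arguments
  execute(args: Record<string, unknown>): Promise<unknown>;
}

interface ToolProvider {
  toolkit(): ToolDefinition[];
}

// Hypothetical stand-in for the plugin's typed methods.
interface TavilyMethods {
  search(query: string, options?: { maxResults?: number }): Promise<unknown>;
  extract(urls: string[]): Promise<unknown>;
}

class TavilyToolProvider implements ToolProvider {
  private readonly client: TavilyMethods;

  constructor(client: TavilyMethods) {
    this.client = client;
  }

  toolkit(): ToolDefinition[] {
    const client = this.client;
    return [
      {
        name: "tavily_search",
        description: "Search the web and return relevance-scored results.",
        parameters: {
          type: "object",
          properties: {
            query: { type: "string" },
            maxResults: { type: "integer", minimum: 1 },
          },
          required: ["query"],
        },
        // Coerce loosely typed agent arguments before calling the typed method.
        execute: (args) => {
          const maxResults =
            typeof args.maxResults === "number" ? args.maxResults : undefined;
          return client.search(String(args.query), { maxResults });
        },
      },
      {
        name: "tavily_extract",
        description: "Extract clean content from one or more URLs.",
        parameters: {
          type: "object",
          properties: { urls: { type: "array", items: { type: "string" } } },
          required: ["urls"],
        },
        execute: (args) => {
          const urls = Array.isArray(args.urls) ? args.urls.map(String) : [];
          return client.extract(urls);
        },
      },
    ];
  }
}
```

Keeping the tools as thin wrappers over the typed methods means agent calls and direct `app.tavily.*` calls would share one code path.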

**Config schema (sketch)** —

```jsonc
{
  "apiKey": { /* from secret resource / TAVILY_API_KEY */ },
  "search": { "maxResults": 5, "searchDepth": "basic", "includeDomains": [], "excludeDomains": [] },
  "cache": { "ttl": 300000 },  // milliseconds (5 minutes)
  "timeout": 30000             // milliseconds (30 seconds)
}
```
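As a sketch of how these defaults might be resolved at plugin construction: the type names and `resolveTavilyConfig` are hypothetical, the field names mirror the config sketch, and `ttl` / `timeout` are assumed to be milliseconds. The environment is passed in explicitly so the function stays pure and easy to test.

```typescript
// ASSUMPTION: illustrative types; durations are taken to be milliseconds.
interface SearchDefaults {
  maxResults: number;
  searchDepth: "basic" | "advanced";
  includeDomains: string[];
  excludeDomains: string[];
}

interface TavilyPluginConfig {
  apiKey?: string;
  search?: Partial<SearchDefaults>;
  cache?: { ttl?: number };
  timeout?: number;
}

interface ResolvedTavilyConfig {
  apiKey: string;
  search: SearchDefaults;
  cache: { ttl: number };
  timeout: number;
}

const DEFAULT_SEARCH: SearchDefaults = {
  maxResults: 5,
  searchDepth: "basic",
  includeDomains: [],
  excludeDomains: [],
};

// Merge user config over defaults; the API key falls back to TAVILY_API_KEY.
function resolveTavilyConfig(
  user: TavilyPluginConfig,
  env: Record<string, string | undefined>,
): ResolvedTavilyConfig {
  const apiKey = user.apiKey ?? env.TAVILY_API_KEY;
  if (!apiKey) {
    throw new Error("Tavily API key missing: configure the secret resource or TAVILY_API_KEY");
  }
  return {
    apiKey,
    search: { ...DEFAULT_SEARCH, ...user.search },
    cache: { ttl: user.cache?.ttl ?? 300_000 },
    timeout: user.timeout ?? 30_000,
  };
}
```

Failing fast on a missing key at startup surfaces a misconfigured secret resource before the first search request does.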

## Scope

- **v1:** `search` + `extract`, manifest, config schema, agents `toolkit()`, tests, docs page.
- **Later (separate PRs):** `crawl` / `map`, deep `research` (long-running — fits the SSE streaming machinery), and optionally an `appkit-ui` results component.

## Open questions for maintainers

1. **Packaging** — core plugin in `packages/appkit` (like genie/serving), or would you prefer third-party integrations in a separate package? This would be the first non-Databricks-service plugin, so happy to follow whatever precedent you want to set.
2. **Beta gating** — should it ship via `beta-exports` initially?
3. **Dependency policy** — we'd use the official `@tavily/core` SDK (MIT); fine, or do you prefer plain `fetch` for supply-chain reasons?

If this direction sounds good, we'll follow up with the PR.

