Commit aef9ac9

Add web search and fetch tools (#11)
* Add web search and fetch tools
* Review fixes
* Update usage sink
* Fix review comments
* Fix WASM build
* Add documentation
* Add internal web search tool
* Review fixes
* Fix duplication issue
1 parent c8a13f5 commit aef9ac9

39 files changed

Lines changed: 2905 additions & 122 deletions

docs/content/docs/configuration/features/index.mdx

Lines changed: 22 additions & 9 deletions
@@ -9,15 +9,16 @@ The `[features]` section enables and configures optional gateway capabilities. A
99

1010
## Feature Overview
1111

12-
| Feature | Section | Purpose |
13-
| ----------------------------------------------------------------- | ----------------------------- | ------------------------------------------------- |
14-
| [File Search](/docs/configuration/features/file-search) | `[features.file_search]` | RAG file_search tool for Responses API |
15-
| [File Processing](/docs/configuration/features/file-processing) | `[features.file_processing]` | Document chunking, OCR, virus scanning |
16-
| [Response Caching](/docs/configuration/features/response-caching) | `[features.response_caching]` | Exact and semantic response caching |
17-
| [Guardrails](/docs/configuration/features/guardrails) | `[features.guardrails]` | Content filtering, PII detection, safety |
18-
| [Image Fetching](/docs/configuration/features/image-fetching) | `[features.image_fetching]` | URL-to-base64 conversion for non-OpenAI providers |
19-
| [WebSocket](/docs/configuration/features/websocket) | `[features.websocket]` | Real-time event subscriptions |
20-
| Model Catalog | `[features.model_catalog]` | Enrich models with capabilities and pricing |
12+
| Feature | Section | Purpose |
13+
| ----------------------------------------------------------------- | ------------------------------------------------ | ------------------------------------------------- |
14+
| [File Search](/docs/configuration/features/file-search) | `[features.file_search]` | RAG file_search tool for Responses API |
15+
| [File Processing](/docs/configuration/features/file-processing) | `[features.file_processing]` | Document chunking, OCR, virus scanning |
16+
| [Response Caching](/docs/configuration/features/response-caching) | `[features.response_caching]` | Exact and semantic response caching |
17+
| [Guardrails](/docs/configuration/features/guardrails) | `[features.guardrails]` | Content filtering, PII detection, safety |
18+
| [Image Fetching](/docs/configuration/features/image-fetching) | `[features.image_fetching]` | URL-to-base64 conversion for non-OpenAI providers |
19+
| [WebSocket](/docs/configuration/features/websocket) | `[features.websocket]` | Real-time event subscriptions |
20+
| [Web Tools](/docs/configuration/features/web-tools) | `[features.web_search]` / `[features.web_fetch]` | Web search and URL fetching for chat UI |
21+
| Model Catalog | `[features.model_catalog]` | Enrich models with capabilities and pricing |
2122

2223
## Minimal Configuration
2324

@@ -108,6 +109,17 @@ require_auth = true
108109
enabled = true
109110
sync_interval_secs = 1800
110111
api_url = "https://models.dev/api.json"
112+
113+
# Web Search
114+
[features.web_search]
115+
provider = "tavily"
116+
api_key = "${TAVILY_API_KEY}"
117+
max_results = 10
118+
119+
# Web Fetch
120+
[features.web_fetch]
121+
max_response_bytes = 1048576
122+
timeout_secs = 30
111123
```
112124

113125
## Model Catalog
@@ -169,6 +181,7 @@ Some features have dependencies on other configuration:
169181
| Semantic Caching | Vector backend (pgvector or Qdrant) |
170182
| Guardrails (Bedrock) | AWS credentials |
171183
| WebSocket auth | Authentication configuration |
184+
| Web Search | Search provider API key (Tavily or Exa) |
172185

173186
<Callout type="info">
174187
For conceptual documentation on how each feature works, see the [Features Guide](/docs/features).

docs/content/docs/configuration/features/meta.json

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@
77
"response-caching",
88
"guardrails",
99
"image-fetching",
10+
"web-tools",
1011
"websocket"
1112
]
1213
}
Lines changed: 163 additions & 0 deletions
@@ -0,0 +1,163 @@
1+
---
2+
title: Web Tools
3+
description: Configure server-side web search and URL fetching tools
4+
---
5+
6+
import { Callout } from "fumadocs-ui/components/callout";
7+
8+
The `[features.web_search]` and `[features.web_fetch]` sections configure server-side web tools used by the chat UI. Web search requires an API key from an external provider (Tavily or Exa); web fetch needs no external service, only its own `[features.web_fetch]` section.
9+
10+
## Web Search
11+
12+
### Configuration Reference
13+
14+
```toml
15+
[features.web_search]
16+
provider = "tavily"
17+
api_key = "${TAVILY_API_KEY}"
18+
max_results = 10
19+
timeout_secs = 30
20+
cost_microcents_per_request = 10000
21+
```
22+
23+
| Key | Type | Default | Description |
24+
| ----------------------------- | ------- | ------- | -------------------------------------------------------- |
25+
| `provider` | string | none | Search provider: `"tavily"` or `"exa"` (required) |
26+
| `api_key` | string | none | Provider API key, supports `${ENV_VAR}` (required) |
27+
| `max_results` | integer | `10` | Maximum results per search (also caps per-request limit) |
28+
| `timeout_secs` | integer | `30` | Request timeout in seconds |
29+
| `cost_microcents_per_request` | integer | `10000` | Cost per request in microcents ($0.01 default) |
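
The note that `max_results` "also caps per-request limit" can be read as a simple clamp. The sketch below is an assumption about that behavior, not the gateway's code; `effective_result_limit` and its `requested` argument are hypothetical names.

```python
from typing import Optional


def effective_result_limit(requested: Optional[int], max_results: int = 10) -> int:
    """Return how many results a single search may return.

    Assumption: a request that does not specify a limit falls back to the
    configured max_results, and a larger requested value is capped at it.
    """
    if requested is None:
        return max_results
    return min(requested, max_results)
```

With the default configuration, a caller asking for 50 results would receive at most 10, while a caller asking for 3 would receive at most 3.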
30+
31+
<Callout type="info">
32+
Omit the `[features.web_search]` section entirely to disable web search. There is no separate
33+
`enabled` flag — the presence of the section enables the feature.
34+
</Callout>
35+
36+
### Provider Setup
37+
38+
#### Tavily
39+
40+
1. Sign up at [tavily.com](https://tavily.com) and obtain an API key
41+
2. Set the environment variable or inline the key:
42+
43+
```toml
44+
[features.web_search]
45+
provider = "tavily"
46+
api_key = "${TAVILY_API_KEY}"
47+
```
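
The `${TAVILY_API_KEY}` placeholder is resolved from the environment when the configuration is loaded. A minimal sketch of that kind of substitution follows, assuming plain `${NAME}` placeholders and a hard failure when the variable is unset. The gateway's actual interpolation rules are not shown on this page, and `expand_env` is a hypothetical helper.

```python
import os
import re

# Matches ${NAME} placeholders. The exact syntax the gateway accepts is an assumption.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Replace each ${NAME} in value with env[NAME]; raise KeyError if NAME is unset."""
    env = os.environ if env is None else env

    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(substitute, value)
```

Failing loudly on a missing variable keeps an empty API key from reaching the provider, which would otherwise surface later as an opaque authentication error.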
48+
49+
#### Exa
50+
51+
1. Sign up at [exa.ai](https://exa.ai) and obtain an API key
52+
2. Configure with the `exa` provider:
53+
54+
```toml
55+
[features.web_search]
56+
provider = "exa"
57+
api_key = "${EXA_API_KEY}"
58+
```
59+
60+
Exa returns full-text content for each result rather than snippets.
61+
62+
### Cost Tracking
63+
64+
Each search request is logged with the configured `cost_microcents_per_request`. The default of `10000` represents $0.01 per search. Adjust to match your provider's actual pricing.
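
As a worked example of the arithmetic, the sketch below takes the stated equivalence at face value: `10000` units correspond to $0.01, which implies 1,000,000 units per dollar. The function name and the divisor constant are illustrative assumptions, not part of the gateway.

```python
from fractions import Fraction

# Assumption derived from the text above: 10000 units == $0.01,
# so one dollar corresponds to 1_000_000 units.
UNITS_PER_DOLLAR = 1_000_000


def total_cost_dollars(requests: int, cost_per_request: int = 10000) -> Fraction:
    """Return the exact dollar cost of `requests` searches at the configured rate."""
    return Fraction(requests * cost_per_request, UNITS_PER_DOLLAR)
```

Under that assumption, 250 searches at the default rate come to $2.50. Using `Fraction` keeps the result exact instead of accumulating floating-point error across many requests.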
65+
66+
## Web Fetch
67+
68+
### Configuration Reference
69+
70+
```toml
71+
[features.web_fetch]
72+
enabled = true
73+
max_response_bytes = 1048576
74+
timeout_secs = 30
75+
allowed_content_types = ["text/html", "text/plain", "application/json", "application/xml", "text/xml", "text/csv", "text/markdown"]
76+
cost_microcents_per_request = 0
77+
```
78+
79+
| Key | Type | Default | Description |
80+
| ----------------------------- | -------- | ---------------- | ------------------------------------------------ |
81+
| `enabled` | boolean | `true` | Enable/disable the web fetch tool |
82+
| `max_response_bytes` | integer | `1048576` (1 MB) | Maximum response body size in bytes |
83+
| `timeout_secs` | integer | `30` | Request timeout in seconds |
84+
| `allowed_content_types` | string[] | See above | Content types to accept (prefix match) |
85+
| `cost_microcents_per_request` | integer | `0` | Cost per request in microcents (free by default) |
86+
87+
<Callout type="info">
88+
Omit the `[features.web_fetch]` section entirely to disable web fetch. When present, set
89+
`enabled = false` to temporarily disable without removing the configuration.
90+
</Callout>
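
The two tools follow different enablement rules: web search is on whenever its section exists, while web fetch also honors an `enabled` flag that defaults to `true`. A small sketch of that logic follows, assuming the `[features]` table has already been parsed into a dictionary. Both function names are hypothetical.

```python
def web_search_enabled(features: dict) -> bool:
    """Web search has no `enabled` flag: the section's presence enables it."""
    return "web_search" in features


def web_fetch_enabled(features: dict) -> bool:
    """Web fetch needs its section to be present and `enabled` not set to false."""
    section = features.get("web_fetch")
    if section is None:
        return False
    return bool(section.get("enabled", True))
```

For example, `{"web_fetch": {"enabled": False}}` keeps the fetch settings in the file while the tool stays off, and an empty `[features.web_fetch]` table turns it on with defaults.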
91+
92+
### Security
93+
94+
Web fetch includes multiple layers of SSRF protection:
95+
96+
| Protection | Description |
97+
| ----------------- | -------------------------------------------------------------- |
98+
| URL validation | Blocks private/loopback IPs by default |
99+
| DNS pinning | Resolves DNS once and pins the connection to prevent rebinding |
100+
| Redirect blocking | Rejects HTTP redirects to prevent SSRF via redirect chains |
101+
| Content filtering | Only fetches allowed content types |
102+
| Size limits | Truncates responses at `max_response_bytes` |
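
The first two protections can be sketched as follows. This is an illustration rather than the gateway's implementation: `is_blocked_address` and `resolve_and_pin` are hypothetical names, the set of blocked ranges (private, loopback, link-local) is an assumption, and the `resolver` parameter exists only so the check can be exercised without network access.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_blocked_address(ip: str) -> bool:
    """Return True for addresses that a fetch should refuse by default."""
    addr = ipaddress.ip_address(ip)
    return addr.is_private or addr.is_loopback or addr.is_link_local


def resolve_and_pin(url: str, resolver=socket.getaddrinfo) -> str:
    """Resolve the URL's host once and return the single IP to connect to.

    Raises ValueError if the URL is not http(s) or any resolved address is blocked.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("only http(s) URLs with a host are accepted")
    port = parts.port or (443 if parts.scheme == "https" else 80)

    # One lookup only: the caller must connect to the returned address instead
    # of resolving the hostname again, which is what defeats DNS rebinding.
    infos = resolver(parts.hostname, port, type=socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    if not addresses:
        raise ValueError("host did not resolve")
    if any(is_blocked_address(address) for address in addresses):
        raise ValueError("host resolves to a private or loopback address")
    return addresses[0]
```

A caller would then open the connection to the returned IP while still sending the original hostname, and treat any 3xx response as an error instead of following it.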
103+
104+
To allow fetching from private/loopback addresses (development only):
105+
106+
```toml
107+
[server]
108+
allow_loopback_urls = true
109+
allow_private_urls = true
110+
```
111+
112+
### Content Type Filtering
113+
114+
The `allowed_content_types` list uses prefix matching. For example, `"text/html"` matches `"text/html; charset=utf-8"`. An empty list allows all content types (not recommended).
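
A minimal sketch of the prefix rule described above. The function name is hypothetical, and the case-insensitive comparison is an assumption the text does not state.

```python
from typing import List


def content_type_allowed(header_value: str, allowed: List[str]) -> bool:
    """Return True if the Content-Type header starts with any allowed prefix.

    An empty list allows every content type, as described above.
    """
    if not allowed:
        return True
    normalized = header_value.strip().lower()
    return any(normalized.startswith(prefix.lower()) for prefix in allowed)
```

Because this is a plain prefix test, an entry such as `"text/"` would admit every text subtype, so listing full media types keeps the filter narrow.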
115+
116+
## Complete Examples
117+
118+
### Development
119+
120+
```toml
121+
[features.web_search]
122+
provider = "tavily"
123+
api_key = "${TAVILY_API_KEY}"
124+
max_results = 5
125+
timeout_secs = 10
126+
127+
[features.web_fetch]
128+
max_response_bytes = 524288
129+
timeout_secs = 15
130+
```
131+
132+
### Production
133+
134+
```toml
135+
[features.web_search]
136+
provider = "tavily"
137+
api_key = "${TAVILY_API_KEY}"
138+
max_results = 10
139+
timeout_secs = 30
140+
cost_microcents_per_request = 10000
141+
142+
[features.web_fetch]
143+
max_response_bytes = 1048576
144+
timeout_secs = 30
145+
cost_microcents_per_request = 0
146+
allowed_content_types = ["text/html", "text/plain", "application/json"]
147+
```
148+
149+
### Web Fetch Only
150+
151+
```toml
152+
# No [features.web_search] — web search disabled
153+
154+
[features.web_fetch]
155+
max_response_bytes = 2097152
156+
timeout_secs = 60
157+
```
158+
159+
## Next Steps
160+
161+
- [Web Tools Feature Guide](/docs/features/web-tools) — How web tools work in the chat UI
162+
- [Frontend Tools](/docs/features/frontend-tools) — Client-side tools (Python, JS, SQL, charts)
163+
- [Authorization](/docs/features/authorization) — RBAC policies for tool access

docs/content/docs/features/frontend-tools.mdx

Lines changed: 12 additions & 10 deletions
@@ -448,16 +448,18 @@ Tools are enabled per-conversation via the toolbar in the chat interface.
448448

449449
### Tool Dependencies
450450

451-
| Tool | Requires |
452-
| ----------- | ---------------------------- |
453-
| Python | None |
454-
| JavaScript | None |
455-
| SQL | Uploaded data files |
456-
| Charts | None (data embedded in spec) |
457-
| HTML | None |
458-
| Wikipedia | Internet connection |
459-
| Wikidata | Internet connection |
460-
| File Search | Attached vector store |
451+
| Tool | Requires |
452+
| ----------- | ------------------------------------------------------------------------------- |
453+
| Python | None |
454+
| JavaScript | None |
455+
| SQL | Uploaded data files |
456+
| Charts | None (data embedded in spec) |
457+
| HTML | None |
458+
| Wikipedia | Internet connection |
459+
| Wikidata | Internet connection |
460+
| File Search | Attached vector store |
461+
| Web Search | [Backend configuration](/docs/configuration/features/web-tools) (Tavily or Exa) |
462+
| Web Fetch | [Backend configuration](/docs/configuration/features/web-tools) |
461463

462464
## Execution Flow
463465

docs/content/docs/features/index.mdx

Lines changed: 16 additions & 0 deletions
@@ -240,6 +240,22 @@ Tool results are displayed inline as interactive artifacts and sent back to the
240240
<Card title="Frontend Tools" href="/docs/features/frontend-tools" />
241241
</Cards>
242242

243+
## Web Tools
244+
245+
Server-side web search and URL fetching, proxied through the gateway with SSRF protection and usage tracking.
246+
247+
| Tool | Provider | Capabilities |
248+
| ---------- | ------------- | --------------------------------------- |
249+
| Web Search | Tavily or Exa | Search the web, returns ranked results |
250+
| Web Fetch | Direct HTTP | Fetch URLs, HTML stripped to plain text |
251+
252+
Web search results appear as inline citations. Both tools require backend configuration.
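
To illustrate what "HTML stripped to plain text" can mean in practice, here is a sketch that uses only Python's standard `html.parser`. How the gateway actually extracts text is not shown on this page; `html_to_text` and the choice to drop `<script>` and `<style>` bodies are assumptions.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    _SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the chunks, then collapse all runs of whitespace to single spaces.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Joining chunks with a space is a simplification: it separates adjacent block elements cleanly but would also split a word that is interrupted by an inline tag.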
253+
254+
<Cards>
255+
<Card title="Web Tools" href="/docs/features/web-tools" />
256+
<Card title="Web Tools Configuration" href="/docs/configuration/features/web-tools" />
257+
</Cards>
258+
243259
## MCP Integration
244260

245261
Connect to external tool servers using the Model Context Protocol (MCP).

docs/content/docs/features/meta.json

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@
1616
"knowledge-bases",
1717
"chat-modes",
1818
"frontend-tools",
19+
"web-tools",
1920
"guardrails",
2021
"mcp",
2122
"mcp-agents",
