diff --git a/docs/external-dependencies.md b/docs/external-dependencies.md
index a28447e7c..756be5144 100644
--- a/docs/external-dependencies.md
+++ b/docs/external-dependencies.md
@@ -19,7 +19,7 @@
 | 11 | BigQuery Metrics | `api.anthropic.com/api/claude_code/metrics` | HTTPS | Enabled by default |
 | 12 | MCP Proxy | `mcp-proxy.anthropic.com` | HTTPS+WS | When using MCP tools |
 | 13 | MCP Registry | `api.anthropic.com/mcp-registry` | HTTPS | When querying MCP servers |
-| 14 | Bing Search | `www.bing.com` | HTTPS | WebSearch tool |
+| 14 | Web Search Pages | `www.bing.com`, `api.search.brave.com` | HTTPS | WebSearch tool; switch backends with `WEB_SEARCH_ADAPTER=bing\|brave` |
 | 15 | Google Cloud Storage (updates) | `storage.googleapis.com` | HTTPS | Version check |
 | 16 | GitHub Raw (Changelog/Stats) | `raw.githubusercontent.com` | HTTPS | Update notice |
 | 17 | Claude in Chrome Bridge | `bridge.claudeusercontent.com` | WSS | Chrome integration |
@@ -121,12 +121,16 @@ An MCP server proxy hosted by Anthropic.
 - **Endpoint**: `https://api.anthropic.com/mcp-registry/v0/servers?version=latest&visibility=commercial`
 - **File**: `src/services/mcp/officialRegistry.ts`
 
-### 14. Bing Search
+### 14. Web Search Pages
 
-The default adapter for the WebSearch tool; scrapes Bing search results.
+The WebSearch tool can scrape Bing search result pages directly, or fetch search context through
+Brave's LLM Context API; set `WEB_SEARCH_ADAPTER=bing|brave` to choose the backend explicitly.
 
-- **Endpoint**: `https://www.bing.com/search?q={query}&setmkt=en-US`
-- **File**: `src/tools/WebSearchTool/adapters/bingAdapter.ts`
+- **Bing endpoint**: `https://www.bing.com/search?q={query}&setmkt=en-US`
+- **Brave endpoint**: `https://api.search.brave.com/res/v1/llm/context?q={query}`
+- **Files**:
+  - `src/tools/WebSearchTool/adapters/bingAdapter.ts`
+  - `src/tools/WebSearchTool/adapters/braveAdapter.ts`
 
 There is also a Domain Blocklist lookup:
 - **Endpoint**: `https://api.anthropic.com/api/web/domain_info?domain={domain}`
@@ -201,6 +205,7 @@ The default adapter for the WebSearch tool; scrapes Bing search results.
 | `{region}-aiplatform.googleapis.com` | Google Vertex AI | HTTPS |
 | `{resource}.services.ai.azure.com` | Azure Foundry | HTTPS |
 | `www.bing.com` | Bing Search | HTTPS |
+| `api.search.brave.com` | Brave Search | HTTPS |
 | `storage.googleapis.com` | Auto-update | HTTPS |
 | `raw.githubusercontent.com` | Changelog / plugin stats | HTTPS |
 | `bridge.claudeusercontent.com` | Chrome Bridge | WSS |
diff --git a/docs/features/web-search-tool.md b/docs/features/web-search-tool.md
index 84802cc2b..5a6db8c34 100644
--- a/docs/features/web-search-tool.md
+++ b/docs/features/web-search-tool.md
@@ -1,11 +1,11 @@
 # WEB_SEARCH_TOOL — Web Search Tool
 
-> Implementation status: adapter architecture complete; the Bing adapter is the current default backend
+> Implementation status: adapter architecture complete; three backends supported (API / Bing / Brave)
 > Reference count: core tool, not gated by a feature flag (always enabled)
 
 ## 1. Overview
 
-WebSearchTool lets the model search the internet for up-to-date information. The original implementation supported only Anthropic API server-side search (the `web_search_20250305` server tool), which is unavailable behind third-party proxy endpoints. It has been refactored into an adapter architecture that adds Bing search page parsing as a fallback, so search works with any API endpoint.
+WebSearchTool lets the model search the internet for up-to-date information. The original implementation supported only Anthropic API server-side search (the `web_search_20250305` server tool), which is unavailable behind third-party proxy endpoints. It has been refactored into an adapter architecture that supports API server-side search plus two direct backends, Bing (HTML parsing) and Brave (LLM Context API), so search works with any API endpoint.
 
 ## 2. Architecture
 
@@ -21,9 +21,13 @@ WebSearchTool.call()
   │     └── Uses the web_search_20250305 server tool
   │         Calls the API a second time via queryModelWithStreaming
   │
-  └── BingSearchAdapter — Bing HTML scraping + regex extraction (current default)
-        └── Fetches the Bing search page HTML directly
-            Regex-extracts title/URL/snippet from b_algo blocks
+  ├── BingSearchAdapter — Bing HTML scraping + regex extraction
+  │     └── Fetches the Bing search page HTML directly
+  │         Regex-extracts title/URL/snippet from b_algo blocks
+  │
+  └── BraveSearchAdapter — Brave LLM Context API
+        └── Calls Brave's HTTPS GET endpoint
+            Maps the grounding payload to title/URL/snippet
 ```
 
 ### 2.2 Module Structure
@@ -37,8 +41,9 @@ WebSearchTool.call()
 | Adapter factory | `src/tools/WebSearchTool/adapters/index.ts` | `createAdapter()` factory function; selects the backend |
 | API adapter | `src/tools/WebSearchTool/adapters/apiAdapter.ts` | Wraps the original `queryModelWithStreaming` logic; uses the server tool |
 | Bing adapter | `src/tools/WebSearchTool/adapters/bingAdapter.ts` | Bing HTML scraping + regex parsing |
-| Unit tests | `src/tools/WebSearchTool/__tests__/bingAdapter.test.ts` | 32 test cases |
-| Integration test | `src/tools/WebSearchTool/__tests__/bingAdapter.integration.ts` | Verifies against real network requests |
+| Brave adapter | `src/tools/WebSearchTool/adapters/braveAdapter.ts` | Brave LLM Context API integration and result mapping |
+| Unit tests | `src/tools/WebSearchTool/__tests__/bingAdapter.test.ts`, `src/tools/WebSearchTool/__tests__/braveAdapter*.test.ts`, `src/tools/WebSearchTool/__tests__/adapterFactory.test.ts` | Tests for Bing / Brave parsing and factory logic |
+| Integration tests | `src/tools/WebSearchTool/__tests__/bingAdapter.integration.ts`, `src/tools/WebSearchTool/__tests__/braveAdapter.integration.ts` | Verify against real network requests |
 
 ### 2.3 Data Flow
 
@@ -49,20 +54,18 @@ WebSearchTool.call()
   validateInput() — checks that query is non-empty and that allowed/block are not both set
       │
       ▼
-  createAdapter() → BingSearchAdapter (currently hard-coded)
+  createAdapter() → ApiSearchAdapter | BingSearchAdapter | BraveSearchAdapter
       │
       ▼
   adapter.search(query, { allowedDomains, blockedDomains, signal, onProgress })
       │
       ├── onProgress({ type: 'query_update', query })
       │
-      ├── axios.get(bing.com/search?q=...&setmkt=en-US)
-      │     └── 13 Edge browser request headers
+      ├── axios.get(search-engine-url)
+      │     └── API auth request headers
       │
-      ├── extractBingResults(html) — regex extraction of <li class="b_algo"> blocks
-      │     ├── resolveBingUrl() — decodes base64 redirect URLs
-      │     ├── extractSnippet() — three-level fallback snippet extraction
-      │     └── decodeHtmlEntities() — he.decode
+      ├── extractResults(payload) — extracts results per backend
+      │     └── grounding → SearchResult[] mapping
       │
       ├── Client-side domain filtering (allowedDomains / blockedDomains)
       │
@@ -117,19 +120,18 @@ Redirect URL format returned by Bing: `bing.com/ck/a?...&u=a1aHR0cHM6Ly9...`
 
 ## 4. Adapter Selection Logic
 
-`createAdapter()` currently hard-codes a `BingSearchAdapter` return; the original logic is kept in comments:
+`createAdapter()` selects a backend in the following priority order and caches the adapter instance by the selected backend key:
 
 ```typescript
 export function createAdapter(): WebSearchAdapter {
-  return new BingSearchAdapter()
-  // Selection logic kept in comments:
-  // 1. WEB_SEARCH_ADAPTER env var forces api|bing
-  // 2. isFirstPartyAnthropicBaseUrl() → API adapter
-  // 3. Third-party endpoint → Bing adapter
+  // 1. Explicit override: WEB_SEARCH_ADAPTER=api|bing|brave
+  // 2. Official Anthropic API base URL → ApiSearchAdapter
+  // 3. Third-party proxy / non-official endpoint → BingSearchAdapter
 }
 ```
 
-To restore automatic selection, uncomment the code in `index.ts`.
+Setting `WEB_SEARCH_ADAPTER=brave` explicitly switches to the Brave LLM Context API backend, which requires
+`BRAVE_SEARCH_API_KEY` or `BRAVE_API_KEY`.
 
 ## 5. Interface Definitions
 
diff --git a/docs/tools/search-and-navigation.mdx b/docs/tools/search-and-navigation.mdx
index 99393748e..9422ea177 100644
--- a/docs/tools/search-and-navigation.mdx
+++ b/docs/tools/search-and-navigation.mdx
@@ -146,14 +146,15 @@ The AI's information gathering is not limited to local code:
 
 ### How WebSearch Works
 
-WebSearch uses the adapter pattern to support two search backends, chosen by the `createAdapter()` factory function in `src/tools/WebSearchTool/adapters/`:
+WebSearch uses the adapter pattern to support three search backends, chosen by the `createAdapter()` factory function in `src/tools/WebSearchTool/adapters/`:
 
 ```
 Adapter architecture:
   WebSearchTool.call()
     → createAdapter() selects the backend
       ├─ ApiSearchAdapter — Anthropic API server-side search (requires an official API key)
-      └─ BingSearchAdapter — scrapes and parses Bing search pages directly (no API key required)
+      ├─ BingSearchAdapter — scrapes and parses Bing search pages directly (no API key required)
+      └─ BraveSearchAdapter — calls the Brave LLM Context API and parses the response (requires a Brave API key)
     → adapter.search(query, options)
     → converted to the unified SearchResult[] format and returned
 ```
@@ -166,8 +167,9 @@ WebSearch uses the adapter pattern to support two search backends, chosen in `src/tools/WebSear
 |--------|------|--------|
 | 1 | Environment variable `WEB_SEARCH_ADAPTER=api` | `ApiSearchAdapter` |
 | 2 | Environment variable `WEB_SEARCH_ADAPTER=bing` | `BingSearchAdapter` |
-| 3 | API base URL points to official Anthropic | `ApiSearchAdapter` |
-| 4 | Third-party proxy / non-official endpoint | `BingSearchAdapter` |
+| 3 | Environment variable `WEB_SEARCH_ADAPTER=brave` | `BraveSearchAdapter` |
+| 4 | API base URL points to official Anthropic | `ApiSearchAdapter` |
+| 5 | Third-party proxy / non-official endpoint | `BingSearchAdapter` |
 
 Adapters are stateless and are cached for reuse within a session.
 
diff --git a/src/tools/WebSearchTool/__tests__/adapterFactory.test.ts b/src/tools/WebSearchTool/__tests__/adapterFactory.test.ts
new file mode 100644
index 000000000..d93b255b4
--- /dev/null
+++ b/src/tools/WebSearchTool/__tests__/adapterFactory.test.ts
@@ -0,0 +1,70 @@
+import { afterEach, describe, expect, mock, test } from 'bun:test'
+
+let isFirstPartyBaseUrl = true
+
+mock.module('../adapters/apiAdapter.js', () => ({
+  ApiSearchAdapter: class ApiSearchAdapter {},
+}))
+
+mock.module('../adapters/bingAdapter.js', () => ({
+  BingSearchAdapter: class BingSearchAdapter {},
+}))
+
+mock.module('../adapters/braveAdapter.js', () => ({
+  BraveSearchAdapter: class BraveSearchAdapter {},
+}))
+
+mock.module('../../../utils/model/providers.js', () => ({
+  isFirstPartyAnthropicBaseUrl: () => isFirstPartyBaseUrl,
+}))
+
+const { createAdapter } = await import('../adapters/index')
+
+const originalWebSearchAdapter = process.env.WEB_SEARCH_ADAPTER
+
+afterEach(() => {
+  isFirstPartyBaseUrl = true
+
+  if (originalWebSearchAdapter === undefined) {
+    delete process.env.WEB_SEARCH_ADAPTER
+  } else {
+    process.env.WEB_SEARCH_ADAPTER = originalWebSearchAdapter
+  }
+})
+
+describe('createAdapter', () => {
+  test('reuses the same instance when the selected backend does not change', () => {
+    process.env.WEB_SEARCH_ADAPTER = 'brave'
+
+    const firstAdapter = createAdapter()
+    const secondAdapter = createAdapter()
+
+    expect(firstAdapter).toBe(secondAdapter)
+    expect(firstAdapter.constructor.name).toBe('BraveSearchAdapter')
+  })
+
+  test('rebuilds the adapter when WEB_SEARCH_ADAPTER changes', () => {
+    process.env.WEB_SEARCH_ADAPTER = 'brave'
+    const braveAdapter = createAdapter()
+
+    process.env.WEB_SEARCH_ADAPTER = 'bing'
+    const bingAdapter = createAdapter()
+
+    expect(bingAdapter).not.toBe(braveAdapter)
+    expect(bingAdapter.constructor.name).toBe('BingSearchAdapter')
+  })
+
+  test('selects the API adapter for first-party Anthropic URLs', () => {
+    delete process.env.WEB_SEARCH_ADAPTER
+    isFirstPartyBaseUrl = true
+
+    expect(createAdapter().constructor.name).toBe('ApiSearchAdapter')
+  })
+
+  test('selects the Bing adapter for third-party Anthropic base URLs', () => {
+    delete process.env.WEB_SEARCH_ADAPTER
+    isFirstPartyBaseUrl = false
+
+    expect(createAdapter().constructor.name).toBe('BingSearchAdapter')
+  })
+})
diff --git a/src/tools/WebSearchTool/__tests__/braveAdapter.extract.test.ts b/src/tools/WebSearchTool/__tests__/braveAdapter.extract.test.ts
new file mode 100644
index 000000000..f891ce3ca
--- /dev/null
+++ b/src/tools/WebSearchTool/__tests__/braveAdapter.extract.test.ts
@@ -0,0 +1,106 @@
+import { describe, expect, test } from 'bun:test'
+import { extractBraveResults } from '../adapters/braveAdapter'
+
+describe('extractBraveResults', () => {
+  test('extracts generic grounding results', () => {
+    const results = extractBraveResults({
+      grounding: {
+        generic: [
+          {
+            title: 'Example Title 1',
+            url: 'https://example.com/page1',
+            snippets: ['First result description'],
+          },
+          {
+            title: 'Example Title 2',
+            url: 'https://example.com/page2',
+            snippets: ['Second result description'],
+          },
+        ],
+      },
+    })
+
+    expect(results).toEqual([
+      {
+        title: 'Example Title 1',
+        url: 'https://example.com/page1',
+        snippet: 'First result description',
+      },
+      {
+        title: 'Example Title 2',
+        url: 'https://example.com/page2',
+        snippet: 'Second result description',
+      },
+    ])
+  })
+
+  test('combines generic, poi, and map grounding results', () => {
+    const results = extractBraveResults({
+      grounding: {
+        generic: [{ title: 'Generic', url: 'https://example.com/generic' }],
+        poi: { title: 'POI', url: 'https://maps.example.com/poi' },
+        map: [{ title: 'Map', url: 'https://maps.example.com/map' }],
+      },
+    })
+
+    expect(results).toEqual([
+      { title: 'Generic', url: 'https://example.com/generic', snippet: undefined },
+      { title: 'POI', url: 'https://maps.example.com/poi', snippet: undefined },
+      { title: 'Map', url: 'https://maps.example.com/map', snippet: undefined },
+    ])
+  })
+
+  test('joins multiple snippets into one summary string', () => {
+    const results = extractBraveResults({
+      grounding: {
+        generic: [
+          {
+            title: 'Joined Snippets',
+            url: 'https://example.com/joined',
+            snippets: ['First snippet.', 'Second snippet.'],
+          },
+        ],
+      },
+    })
+
+    expect(results[0].snippet).toBe('First snippet. Second snippet.')
+  })
+
+  test('skips entries without a title or URL', () => {
+    const results = extractBraveResults({
+      grounding: {
+        generic: [
+          { title: 'Missing URL' },
+          { url: 'https://example.com/missing-title' },
+          { title: 'Valid', url: 'https://example.com/valid' },
+        ],
+      },
+    })
+
+    expect(results).toEqual([
+      { title: 'Valid', url: 'https://example.com/valid', snippet: undefined },
+    ])
+  })
+
+  test('deduplicates repeated URLs across grounding buckets', () => {
+    const results = extractBraveResults({
+      grounding: {
+        generic: [{ title: 'First', url: 'https://example.com/dup' }],
+        poi: { title: 'Second', url: 'https://example.com/dup' },
+        map: [{ title: 'Third', url: 'https://example.com/dup' }],
+      },
+    })
+
+    expect(results).toEqual([
+      { title: 'First', url: 'https://example.com/dup', snippet: undefined },
+    ])
+  })
+
+  test('returns empty array when grounding is missing', () => {
+    expect(extractBraveResults({})).toEqual([])
+  })
+
+  test('returns empty array when grounding arrays are absent', () => {
+    expect(extractBraveResults({ grounding: {} })).toEqual([])
+  })
+})
diff --git a/src/tools/WebSearchTool/__tests__/braveAdapter.integration.ts b/src/tools/WebSearchTool/__tests__/braveAdapter.integration.ts
new file mode 100644
index 000000000..f7dc6e653
--- /dev/null
+++ 
b/src/tools/WebSearchTool/__tests__/braveAdapter.integration.ts
@@ -0,0 +1,91 @@
+/**
+ * Integration test for BraveSearchAdapter — hits Brave's LLM context API.
+ *
+ * Usage:
+ * BRAVE_SEARCH_API_KEY=... bun run src/tools/WebSearchTool/__tests__/braveAdapter.integration.ts
+ *
+ * Optional env vars:
+ * BRAVE_QUERY — search query (default: "Claude AI Anthropic")
+ * BRAVE_API_KEY — fallback key env var
+ */
+
+if (!globalThis.MACRO) {
+  globalThis.MACRO = { VERSION: '0.0.0-test', BUILD_TIME: '0' } as any
+}
+
+import { BraveSearchAdapter } from '../adapters/braveAdapter'
+
+const query = process.env.BRAVE_QUERY || 'Claude AI Anthropic'
+
+async function main() {
+  if (!process.env.BRAVE_SEARCH_API_KEY && !process.env.BRAVE_API_KEY) {
+    console.error(
+      '❌ Missing Brave API key. Set BRAVE_SEARCH_API_KEY or BRAVE_API_KEY.',
+    )
+    process.exit(1)
+  }
+
+  console.log(`\n🔍 Searching Brave for: "${query}"\n`)
+
+  const adapter = new BraveSearchAdapter()
+  const startTime = Date.now()
+
+  const results = await adapter.search(query, {
+    onProgress: p => {
+      if (p.type === 'query_update') {
+        console.log(` → Query sent: ${p.query}`)
+      }
+      if (p.type === 'search_results_received') {
+        console.log(` → Received ${p.resultCount} results`)
+      }
+    },
+  })
+
+  const elapsed = Date.now() - startTime
+  console.log(`\n✅ Done in ${elapsed}ms — ${results.length} result(s)\n`)
+
+  if (results.length === 0) {
+    console.log('⚠️ No results returned. Possible causes:')
+    console.log(' - Brave returned no grounding data for the query')
+    console.log(' - Network/firewall issue')
+    console.log(' - Invalid or rate-limited Brave API key\n')
+    process.exit(1)
+  }
+
+  for (const [i, r] of results.entries()) {
+    console.log(` ${i + 1}. ${r.title}`)
+    console.log(` ${r.url}`)
+    if (r.snippet) {
+      const snippet = r.snippet.replace(/\n/g, ' ')
+      console.log(
+        ` ${snippet.slice(0, 150)}${snippet.length > 150 ? '…' : ''}`,
+      )
+    }
+    console.log()
+  }
+
+  let passed = true
+  for (const [i, r] of results.entries()) {
+    if (!r.title || typeof r.title !== 'string') {
+      console.error(`❌ Result ${i + 1}: missing or non-string title`, r)
+      passed = false
+    }
+    if (!r.url || !r.url.startsWith('http')) {
+      console.error(`❌ Result ${i + 1}: missing or non-http url`, r)
+      passed = false
+    }
+  }
+
+  if (passed) {
+    console.log('✅ All results have valid structure.\n')
+  } else {
+    process.exit(1)
+  }
+}
+
+if (import.meta.main) {
+  main().catch(e => {
+    console.error('❌ Fatal error:', e)
+    process.exit(1)
+  })
+}
diff --git a/src/tools/WebSearchTool/__tests__/braveAdapter.test.ts b/src/tools/WebSearchTool/__tests__/braveAdapter.test.ts
new file mode 100644
index 000000000..8158e6dde
--- /dev/null
+++ b/src/tools/WebSearchTool/__tests__/braveAdapter.test.ts
@@ -0,0 +1,273 @@
+import { afterEach, beforeEach, describe, expect, mock, test } from 'bun:test'
+
+const originalBraveSearchApiKey = process.env.BRAVE_SEARCH_API_KEY
+const originalBraveApiKey = process.env.BRAVE_API_KEY
+
+describe('BraveSearchAdapter.search', () => {
+  const createAdapter = async () => {
+    const { BraveSearchAdapter } = await import('../adapters/braveAdapter')
+    return new BraveSearchAdapter()
+  }
+
+  const SAMPLE_RESPONSE = {
+    grounding: {
+      generic: [
+        {
+          title: 'Result One',
+          url: 'https://example.com/result1',
+          snippets: ['Snippet one'],
+        },
+        {
+          title: 'Result Two',
+          url: 'https://example.com/result2',
+          snippets: ['Snippet two'],
+        },
+      ],
+    },
+  }
+
+  beforeEach(() => {
+    process.env.BRAVE_SEARCH_API_KEY = 'test-brave-key'
+    delete process.env.BRAVE_API_KEY
+  })
+
+  afterEach(() => {
+    mock.restore()
+
+    if (originalBraveSearchApiKey === undefined) {
+      delete process.env.BRAVE_SEARCH_API_KEY
+    } else {
+      process.env.BRAVE_SEARCH_API_KEY = originalBraveSearchApiKey
+    }
+
+    if (originalBraveApiKey === undefined) {
+      delete process.env.BRAVE_API_KEY
+    } else {
+      process.env.BRAVE_API_KEY = originalBraveApiKey
+    }
+  })
+
+  test('returns parsed results from Brave LLM context payload', async () => {
+    mock.module('axios', () => ({
+      default: {
+        get: mock(() => Promise.resolve({ data: SAMPLE_RESPONSE })),
+        isCancel: () => false,
+      },
+    }))
+
+    const adapter = await createAdapter()
+    const results = await adapter.search('test query', {})
+
+    expect(results).toHaveLength(2)
+    expect(results[0]).toEqual({
+      title: 'Result One',
+      url: 'https://example.com/result1',
+      snippet: 'Snippet one',
+    })
+    expect(results[1].title).toBe('Result Two')
+  })
+
+  test('calls onProgress with query_update and search_results_received', async () => {
+    mock.module('axios', () => ({
+      default: {
+        get: mock(() => Promise.resolve({ data: SAMPLE_RESPONSE })),
+        isCancel: () => false,
+      },
+    }))
+
+    const progressCalls: any[] = []
+    const onProgress = (p: any) => progressCalls.push(p)
+
+    const adapter = await createAdapter()
+    await adapter.search('test', { onProgress })
+
+    expect(progressCalls).toHaveLength(2)
+    expect(progressCalls[0]).toEqual({
+      type: 'query_update',
+      query: 'test',
+    })
+    expect(progressCalls[1]).toEqual({
+      type: 'search_results_received',
+      resultCount: 2,
+      query: 'test',
+    })
+  })
+
+  test('filters results by allowedDomains', async () => {
+    const mixedResponse = {
+      grounding: {
+        generic: [
+          { title: 'Allowed', url: 'https://allowed.com/a' },
+          { title: 'Blocked', url: 'https://blocked.com/b' },
+        ],
+      },
+    }
+
+    mock.module('axios', () => ({
+      default: {
+        get: mock(() => Promise.resolve({ data: mixedResponse })),
+        isCancel: () => false,
+      },
+    }))
+
+    const adapter = await createAdapter()
+    const results = await adapter.search('test', {
+      allowedDomains: ['allowed.com'],
+    })
+
+    expect(results).toHaveLength(1)
+    expect(results[0].url).toBe('https://allowed.com/a')
+  })
+
+  test('filters results by blockedDomains', async () => {
+    const mixedResponse = {
+      grounding: {
+        generic: [
+          { title: 'Good', url: 'https://good.com/a' },
+          { title: 'Spam', url: 'https://spam.com/b' },
+        ],
+      },
+    }
+
+    mock.module('axios', () => ({
+      default: {
+        get: mock(() => Promise.resolve({ data: mixedResponse })),
+        isCancel: () => false,
+      },
+    }))
+
+    const adapter = await createAdapter()
+    const results = await adapter.search('test', {
+      blockedDomains: ['spam.com'],
+    })
+
+    expect(results).toHaveLength(1)
+    expect(results[0].url).toBe('https://good.com/a')
+  })
+
+  test('filters subdomains with allowedDomains', async () => {
+    const response = {
+      grounding: {
+        generic: [
+          { title: 'Subdomain', url: 'https://docs.example.com/page' },
+          { title: 'Other', url: 'https://other.com/page' },
+        ],
+      },
+    }
+
+    mock.module('axios', () => ({
+      default: {
+        get: mock(() => Promise.resolve({ data: response })),
+        isCancel: () => false,
+      },
+    }))
+
+    const adapter = await createAdapter()
+    const results = await adapter.search('test', {
+      allowedDomains: ['example.com'],
+    })
+
+    expect(results).toHaveLength(1)
+    expect(results[0].url).toBe('https://docs.example.com/page')
+  })
+
+  test('throws AbortError when signal is already aborted', async () => {
+    mock.module('axios', () => ({
+      default: {
+        get: mock((_url: string, config: any) => {
+          if (config?.signal?.aborted) {
+            const err = new Error('canceled')
+            ;(err as any).__CANCEL__ = true
+            return Promise.reject(err)
+          }
+          return Promise.resolve({ data: SAMPLE_RESPONSE })
+        }),
+        isCancel: (e: any) => e?.__CANCEL__ === true,
+      },
+    }))
+
+    const adapter = await createAdapter()
+    const controller = new AbortController()
+    controller.abort()
+
+    const { AbortError } = await import('../../../utils/errors')
+    await expect(
+      adapter.search('test', { signal: controller.signal }),
+    ).rejects.toThrow(AbortError)
+  })
+
+  test('re-throws non-abort axios errors', async () => {
+    const networkError = new Error('Network error')
+    mock.module('axios', () => ({
+      default: {
+        get: mock(() => Promise.reject(networkError)),
+        isCancel: () => false,
+      },
+    }))
+
+    const adapter = await createAdapter()
+    await expect(adapter.search('test', {})).rejects.toThrow('Network error')
+  })
+
+  test('sends the documented HTTPS endpoint with query params and auth header', async () => {
+    const axiosGet = mock(() => Promise.resolve({ data: SAMPLE_RESPONSE }))
+    mock.module('axios', () => ({
+      default: {
+        get: axiosGet,
+        isCancel: () => false,
+      },
+    }))
+
+    const adapter = await createAdapter()
+    await adapter.search('hello world & special=chars', {})
+
+    expect(axiosGet.mock.calls).toHaveLength(1)
+    expect((axiosGet.mock.calls as any[][])[0][0]).toBe(
+      'https://api.search.brave.com/res/v1/llm/context',
+    )
+    expect((axiosGet.mock.calls as any[][])[0][1]).toMatchObject({
+      params: { q: 'hello world & special=chars' },
+      headers: {
+        Accept: 'application/json',
+        'X-Subscription-Token': 'test-brave-key',
+      },
+    })
+  })
+
+  test('accepts BRAVE_API_KEY as a fallback env var', async () => {
+    delete process.env.BRAVE_SEARCH_API_KEY
+    process.env.BRAVE_API_KEY = 'fallback-key'
+
+    const axiosGet = mock(() => Promise.resolve({ data: SAMPLE_RESPONSE }))
+    mock.module('axios', () => ({
+      default: {
+        get: axiosGet,
+        isCancel: () => false,
+      },
+    }))
+
+    const adapter = await createAdapter()
+    await adapter.search('test', {})
+
+    expect((axiosGet.mock.calls as any[][])[0][1].headers).toMatchObject({
+      'X-Subscription-Token': 'fallback-key',
+    })
+  })
+
+  test('throws when no Brave API key is configured', async () => {
+    delete process.env.BRAVE_SEARCH_API_KEY
+    delete process.env.BRAVE_API_KEY
+
+    mock.module('axios', () => ({
+      default: {
+        get: mock(() => Promise.resolve({ data: SAMPLE_RESPONSE })),
+        isCancel: () => false,
+      },
+    }))
+
+    const adapter = await createAdapter()
+    await expect(adapter.search('test', {})).rejects.toThrow(
+      'BraveSearchAdapter requires BRAVE_SEARCH_API_KEY or BRAVE_API_KEY',
+    )
+  })
+})
diff --git a/src/tools/WebSearchTool/adapters/braveAdapter.ts b/src/tools/WebSearchTool/adapters/braveAdapter.ts
new file mode 100644
index 
000000000..fbfc6e7da
--- /dev/null
+++ b/src/tools/WebSearchTool/adapters/braveAdapter.ts
@@ -0,0 +1,169 @@
+/**
+ * Brave-based search adapter — fetches Brave's LLM context API and maps the
+ * grounding payload into SearchResult objects.
+ */
+
+import axios from 'axios'
+import { AbortError } from '../../../utils/errors.js'
+import type { SearchResult, SearchOptions, WebSearchAdapter } from './types.js'
+
+const FETCH_TIMEOUT_MS = 30_000
+const BRAVE_LLM_CONTEXT_URL = 'https://api.search.brave.com/res/v1/llm/context'
+const BRAVE_API_KEY_ENV_VARS = ['BRAVE_SEARCH_API_KEY', 'BRAVE_API_KEY'] as const
+
+interface BraveGroundingResult {
+  title?: string
+  url?: string
+  snippets?: string[]
+}
+
+interface BraveSearchResponse {
+  grounding?: {
+    generic?: BraveGroundingResult[]
+    map?: BraveGroundingResult[]
+    poi?: BraveGroundingResult | null
+  }
+}
+
+export class BraveSearchAdapter implements WebSearchAdapter {
+  async search(
+    query: string,
+    options: SearchOptions,
+  ): Promise<SearchResult[]> {
+    const { signal, onProgress, allowedDomains, blockedDomains } = options
+
+    if (signal?.aborted) {
+      throw new AbortError()
+    }
+
+    onProgress?.({ type: 'query_update', query })
+
+    const abortController = new AbortController()
+    if (signal) {
+      signal.addEventListener('abort', () => abortController.abort(), {
+        once: true,
+      })
+    }
+
+    let payload: BraveSearchResponse
+    try {
+      const response = await axios.get<BraveSearchResponse>(
+        BRAVE_LLM_CONTEXT_URL,
+        {
+          signal: abortController.signal,
+          timeout: FETCH_TIMEOUT_MS,
+          responseType: 'json',
+          headers: {
+            Accept: 'application/json',
+            'X-Subscription-Token': getBraveApiKey(),
+          },
+          params: { q: query },
+        },
+      )
+      payload = response.data
+    } catch (e) {
+      if (axios.isCancel(e) || abortController.signal.aborted) {
+        throw new AbortError()
+      }
+      throw e
+    }
+
+    if (abortController.signal.aborted) {
+      throw new AbortError()
+    }
+
+    const rawResults = extractBraveResults(payload)
+    const results = rawResults.filter(r => {
+      try {
+        const hostname = new URL(r.url).hostname
+        if (
+          allowedDomains?.length &&
+          !allowedDomains.some(
+            d => hostname === d || hostname.endsWith('.' + d),
+          )
+        ) {
+          return false
+        }
+        if (
+          blockedDomains?.length &&
+          blockedDomains.some(d => hostname === d || hostname.endsWith('.' + d))
+        ) {
+          return false
+        }
+      } catch {
+        return false
+      }
+      return true
+    })
+
+    onProgress?.({
+      type: 'search_results_received',
+      resultCount: results.length,
+      query,
+    })
+
+    return results
+  }
+}
+
+export function extractBraveResults(
+  payload: BraveSearchResponse,
+): SearchResult[] {
+  const grounding = payload.grounding
+  if (!grounding) {
+    return []
+  }
+
+  const entries = [
+    ...(Array.isArray(grounding.generic) ? grounding.generic : []),
+    ...(grounding.poi ? [grounding.poi] : []),
+    ...(Array.isArray(grounding.map) ? grounding.map : []),
+  ]
+
+  const seenUrls = new Set<string>()
+  const results: SearchResult[] = []
+
+  for (const entry of entries) {
+    if (!entry?.url || !entry.title || seenUrls.has(entry.url)) {
+      continue
+    }
+
+    seenUrls.add(entry.url)
+    results.push({
+      title: entry.title,
+      url: entry.url,
+      snippet: normalizeSnippet(entry.snippets),
+    })
+  }
+
+  return results
+}
+
+function normalizeSnippet(snippets: string[] | undefined): string | undefined {
+  if (!Array.isArray(snippets)) {
+    return undefined
+  }
+
+  const normalized = snippets
+    .map(snippet => snippet.trim())
+    .filter(snippet => snippet.length > 0)
+
+  if (normalized.length === 0) {
+    return undefined
+  }
+
+  return normalized.join(' ')
+}
+
+function getBraveApiKey(): string {
+  for (const envVar of BRAVE_API_KEY_ENV_VARS) {
+    const value = process.env[envVar]?.trim()
+    if (value) {
+      return value
+    }
+  }
+
+  throw new Error(
+    'BraveSearchAdapter requires BRAVE_SEARCH_API_KEY or BRAVE_API_KEY',
+  )
+}
diff --git a/src/tools/WebSearchTool/adapters/index.ts b/src/tools/WebSearchTool/adapters/index.ts
index 49bf07ed9..16c5b6c50 100644
--- a/src/tools/WebSearchTool/adapters/index.ts
+++ 
b/src/tools/WebSearchTool/adapters/index.ts
@@ -6,36 +6,42 @@
 import { isFirstPartyAnthropicBaseUrl } from '../../../utils/model/providers.js'
 import { ApiSearchAdapter } from './apiAdapter.js'
 import { BingSearchAdapter } from './bingAdapter.js'
+import { BraveSearchAdapter } from './braveAdapter.js'
 import type { WebSearchAdapter } from './types.js'
 
-export type { SearchResult, SearchOptions, SearchProgress, WebSearchAdapter } from './types.js'
+export type {
+  SearchResult,
+  SearchOptions,
+  SearchProgress,
+  WebSearchAdapter,
+} from './types.js'
 
 let cachedAdapter: WebSearchAdapter | null = null
+let cachedAdapterKey: 'api' | 'bing' | 'brave' | null = null
 
 export function createAdapter(): WebSearchAdapter {
-  // Use the Bing adapter directly and skip the API adapter selection logic
-  return new BingSearchAdapter()
-//   // Adapter is stateless — safe to reuse across calls within a session
-//   if (cachedAdapter) return cachedAdapter
+  const envAdapter = process.env.WEB_SEARCH_ADAPTER
+  const adapterKey =
+    envAdapter === 'api' || envAdapter === 'bing' || envAdapter === 'brave'
+      ? envAdapter
+      : isFirstPartyAnthropicBaseUrl()
+        ? 'api'
+        : 'bing'
 
-//   // Env override: WEB_SEARCH_ADAPTER=api|bing forces specific backend
-//   const envAdapter = process.env.WEB_SEARCH_ADAPTER
-//   if (envAdapter === 'api') {
-//     cachedAdapter = new ApiSearchAdapter()
-//     return cachedAdapter
-//   }
-//   if (envAdapter === 'bing') {
-//     cachedAdapter = new BingSearchAdapter()
-//     return cachedAdapter
-//   }
+  if (cachedAdapter && cachedAdapterKey === adapterKey) return cachedAdapter
 
-//   // Anthropic official URL → API server-side search
-//   if (isFirstPartyAnthropicBaseUrl()) {
-//     cachedAdapter = new ApiSearchAdapter()
-//     return cachedAdapter
-//   }
+  if (adapterKey === 'api') {
+    cachedAdapter = new ApiSearchAdapter()
+    cachedAdapterKey = 'api'
+    return cachedAdapter
+  }
+  if (adapterKey === 'bing') {
+    cachedAdapter = new BingSearchAdapter()
+    cachedAdapterKey = 'bing'
+    return cachedAdapter
+  }
 
-//   // Third-party proxies / non-Anthropic endpoints → Bing fallback
-//   cachedAdapter = new BingSearchAdapter()
-//   return cachedAdapter
+  cachedAdapter = new BraveSearchAdapter()
+  cachedAdapterKey = 'brave'
+  return cachedAdapter
 }
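
Note on the selection order: the priority that `createAdapter()` applies in this change can be sketched as a standalone pure function, which may make the rule easier to review in isolation. This is an illustration only and not code from the PR. `resolveAdapterKey`, `AdapterKey`, and the `isFirstParty` parameter are hypothetical names; the boolean stands in for the result of `isFirstPartyAnthropicBaseUrl()`.

```typescript
// Hypothetical helper (not part of the PR): mirrors the priority used by createAdapter().
type AdapterKey = 'api' | 'bing' | 'brave'

function resolveAdapterKey(
  envAdapter: string | undefined,
  isFirstParty: boolean,
): AdapterKey {
  // 1. A recognized WEB_SEARCH_ADAPTER value always wins.
  if (envAdapter === 'api' || envAdapter === 'bing' || envAdapter === 'brave') {
    return envAdapter
  }
  // 2. Without an override, first-party Anthropic base URLs use server-side search.
  // 3. Every other endpoint falls back to Bing.
  return isFirstParty ? 'api' : 'bing'
}

console.log(resolveAdapterKey('brave', true)) // prints "brave" (explicit override)
console.log(resolveAdapterKey(undefined, true)) // prints "api" (first-party default)
console.log(resolveAdapterKey('unknown', false)) // prints "bing" (unrecognized value is ignored)
```

Under this reading, Brave is never selected automatically; it is reachable only through the explicit override, which fits its requirement for `BRAVE_SEARCH_API_KEY` or `BRAVE_API_KEY`.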