
DeepWiki calls the embedding model (nomic-embed-text) instead of the generation model (qwen3:4b) when generating the wiki #511

@Ranlicness

Description


Problem description

When using DeepWiki to analyze a local C-code project, wiki generation fails: the page shows "generating" indefinitely and eventually reports an error.

Expected behavior

The wiki structure document for the project is generated successfully.

Actual behavior

DeepWiki calls the nomic-embed-text embedding model to generate the wiki structure, and the request times out.

Relevant logs

Key lines from docker logs deepwiki:


Using HTTP fallback for chat completion instead of WebSockets
2026-04-21 07:45:41,387 - INFO - api.simple_chat - simple_chat.py:86 - Request size: 1305 tokens
2026-04-21 07:45:41,390 - INFO - api.ollama_patch - ollama_patch.py:48 - Ollama model 'nomic-embed-text' is available
2026-04-21 07:45:41,406 - INFO - adalflow.core.prompt_builder - prompt_builder.py:74 - Prompt has variables: ['schema']
2026-04-21 07:45:41,407 - INFO - adalflow.core.prompt_builder - prompt_builder.py:74 - Prompt has variables: ['schema']
2026-04-21 07:45:41,407 - INFO - api.rag - rag.py:74 - Dialog turns list exists but is empty
2026-04-21 07:45:41,407 - INFO - api.rag - rag.py:88 - Returning 0 dialog turns from memory
2026-04-21 07:45:41,424 - INFO - adalflow.optim.grad_component - grad_component.py:79 - EvalFnToTextLoss: No backward engine provided. Creating one using model_client and model_kwargs.
2026-04-21 07:45:41,424 - INFO - adalflow.core.generator - generator.py:188 - Generator Generator initialized.
2026-04-21 07:45:41,424 - INFO - api.data_pipeline - data_pipeline.py:789 - Preparing repo storage for /data/code/src...
2026-04-21 07:45:41,424 - INFO - api.data_pipeline - data_pipeline.py:825 - Repo paths: {'save_repo_dir': '/data/code/src', 'save_db_file': '/root/.adalflow/databases/src.pkl'}
2026-04-21 07:45:41,424 - INFO - api.data_pipeline - data_pipeline.py:870 - Loading existing database...
2026-04-21 07:45:41,438 - INFO - adalflow.core.component - component.py:335 - Restoring class using from_dict TextSplitter, {'type': 'TextSplitter', 'data': {'_components': {'_ordered_dict': True, 'data': []}, '_parameters': {'_ordered_dict': True, 'data': []}, 'training': False, 'teacher_mode': False, 'tracing': False, 'name': 'TextSplitter', '_init_args': {'split_by': 'word', 'chunk_size': 800, 'chunk_overlap': 200, 'batch_size': 1000, 'separators': {'page': '\x0c', 'passage': '\n\n', 'word': ' ', 'sentence': '.', 'token': ''}}, 'split_by': 'word', 'separators': {'page': '\x0c', 'passage': '\n\n', 'word': ' ', 'sentence': '.', 'token': ''}, 'chunk_size': 350, 'chunk_overlap': 100, 'batch_size': 1000}}
2026-04-21 07:45:41,439 - ERROR - adalflow.core.component - component.py:345 - Unknown class type: OllamaDocumentProcessor
2026-04-21 07:45:41,439 - INFO - api.data_pipeline - data_pipeline.py:879 - Loaded 425 documents from existing database (embeddings: 425 non-empty, 0 empty; sample_dims=[768])
2026-04-21 07:45:41,439 - INFO - api.rag - rag.py:372 - Loaded 425 documents for retrieval
2026-04-21 07:45:41,440 - INFO - api.rag - rag.py:301 - Target embedding size: 768 (found in 425 documents)
2026-04-21 07:45:41,440 - INFO - api.rag - rag.py:335 - Embedding validation complete: 425/425 documents have valid embeddings
2026-04-21 07:45:41,440 - INFO - api.rag - rag.py:380 - Using 425 documents with valid embeddings for retrieval
2026-04-21 07:45:41,440 - INFO - adalflow.optim.grad_component - grad_component.py:79 - EvalFnToTextLoss: No backward engine provided. Creating one using model_client and model_kwargs.
2026-04-21 07:45:41,449 - WARNING - adalflow.components.retriever.faiss_retriever - faiss_retriever.py:191 - Embeddings are not normalized, normalizing the embeddings
2026-04-21 07:45:41,450 - INFO - adalflow.components.retriever.faiss_retriever - faiss_retriever.py:197 - Index built with 425 chunks
2026-04-21 07:45:41,450 - INFO - api.rag - rag.py:391 - FAISS retriever created successfully
2026-04-21 07:45:41,450 - INFO - api.simple_chat - simple_chat.py:115 - Retriever prepared for /data/code/src
2026-04-21 07:45:41,451 - INFO - adalflow.components.model_client.ollama_client - ollama_client.py:423 - api_kwargs: {'model': 'nomic-embed-text', 'prompt': 'Analyze this GitHub repository local/src and create a wiki structure for it.\n\n1. The complete file tree of the project:\n<file_tree>\nTM4C_eeprom.c\nTM4C_eeprom.h\nTM4C_gpio.c\nTM4C_gpio.h\nTM4C_i2c.c\nTM4C_i2c.h\nTM4C_net.c\nTM4C_net.h\nTM4C_spi.c\nTM4C_spi.h\nTM4C_time.c\nTM4C_time.h\nTM4C_uart.c\nTM4C_uart.h\nbasetype.h\ncli.c\ncli.h\ncommand.c\ncommand.h\ngearbox.c\ngearbox.h\ninphi_config.h\ninphi_rtos.c\ninphi_rtos.h\ninphi_types.h\nled.c\nled.h\nlog.c\nlog.h\nmain.c\nmain.h\nmdio.c\nmdio.h\npla_cfp2_fun_acacia.c\npla_cfp2_fun_acacia.h\npla_cfp2_fun_fiberhome.c\npla_cfp2_fun_fiberhome.h\npla_cfp2_fun_fiberhome_inphi.c\npla_cfp2_fun_fiberhome_inphi.h\npla_cfp2_fun_fiberhome_luyu.c\npla_cfp2_fun_fiberhome_luyu.h\npla_cfp2_fun_hisilicon.c\npla_cfp2_fun_hisilicon.h\npla_cfp2_fun_innolight.c\npla_cfp2_fun_innolight.h\npla_cfp2_fun_innolight_inphi.c\npla_cfp2_fun_innolight_inphi.h\npla_cfp4_fun_acacia.c\npla_cfp4_fun_acacia.h\npla_cfp4_fun_fiberhome_inphi.c\npla_cfp4_fun_fiberhome_inphi.h\npla_cfp4_strengthen_fun_acacia.c\npla_cfp_card.c\npla_cfp_card.h\npla_cfp_fun.c\npla_cfp_fun.h\npla_cfp_init.c\npla_cfp_init.h\npla_cfp_module.c\npla_cfp_module.h\npower_info.c\npower_info.h\nqsfp.c\nqsfp.h\nqsfp_oam.c\nqsfp_oam.h\nrtc.c\nrtc.h\nsdd_card.c\nsdd_card.h\nsdd_init.c\nsdd_init.h\nsdd_t4dh.c\nsdd_t4dh.h\ntelnet.c\ntelnet.h\ntest.c\ntest.h\nudp_api.c\nudp_api.h\nutl_string.c\nutl_string.h\nutl_vt.c\nutl_vt.h\nvega_api.c\nvega_api.h\nvega_registers.h\nvega_rules.h\n</file_tree>\n\n2. The README file of the project:\n<readme>\n\n</readme>\n\nI want to create a wiki for this repository. 
Determine the most logical structure for a wiki based on the repository\'s content.\n\nIMPORTANT: The wiki content will be generated in Mandarin Chinese (中文) language.\n\nWhen designing the wiki structure, include pages that would benefit from visual diagrams, such as:\n- Architecture overviews\n- Data flow descriptions\n- Component relationships\n- Process workflows\n- State machines\n- Class hierarchies\n\n\nCreate a structured wiki with the following main sections:\n- Overview (general information about the project)\n- System Architecture (how the system is designed)\n- Core Features (key functionality)\n- Data Management/Flow: If applicable, how data is stored, processed, accessed, and managed (e.g., database schema, data pipelines, state management).\n- Frontend Components (UI elements, if applicable.)\n- Backend Systems (server-side components)\n- Model Integration (AI model connections)\n- Deployment/Infrastructure (how to deploy, what\'s the infrastructure like)\n- Extensibility and Customization: If the project architecture supports it, explain how to extend or customize its functionality (e.g., plugins, theming, custom modules, hooks).\n\nEach section should contain relevant pages. 
For example, the "Frontend Components" section might include pages for "Home Page", "Repository Wiki Page", "Ask Component", etc.\n\nReturn your analysis in the following XML format:\n\n<wiki_structure>\n  <title>[Overall title for the wiki]</title>\n  <description>[Brief description of the repository]</description>\n  <sections>\n    <section id="section-1">\n      <title>[Section title]</title>\n      <pages>\n        <page_ref>page-1</page_ref>\n        <page_ref>page-2</page_ref>\n      </pages>\n      <subsections>\n        <section_ref>section-2</section_ref>\n      </subsections>\n    </section>\n    <!-- More sections as needed -->\n  </sections>\n  <pages>\n    <page id="page-1">\n      <title>[Page title]</title>\n      <description>[Brief description of what this page will cover]</description>\n      <importance>high|medium|low</importance>\n      <relevant_files>\n        <file_path>[Path to a relevant file]</file_path>\n        <!-- More file paths as needed -->\n      </relevant_files>\n      <related_pages>\n        <related>page-2</related>\n        <!-- More related page IDs as needed -->\n      </related_pages>\n      <parent_section>section-1</parent_section>\n    </page>\n    <!-- More pages as needed -->\n  </pages>\n</wiki_structure>\n\n\nIMPORTANT FORMATTING INSTRUCTIONS:\n- Return ONLY the valid XML structure specified above\n- DO NOT wrap the XML in markdown code blocks (no ``` or ```xml)\n- DO NOT include any explanation text before or after the XML\n- Ensure the XML is properly formatted and valid\n- Start directly with <wiki_structure> and end with </wiki_structure>\n\nIMPORTANT:\n1. Create 8-12 pages that would make a comprehensive wiki for this repository\n2. Each page should focus on a specific aspect of the codebase (e.g., architecture, key features, setup)\n3. The relevant_files should be actual files from the repository that would be used to generate that page\n4. 
Return ONLY valid XML with the structure specified above, with no markdown code block delimiters'}
2026-04-21 07:45:45,736 - INFO - httpx - _client.py:1025 - HTTP Request: POST http://192.168.27.21:11434/api/embeddings "HTTP/1.1 200 OK"
2026-04-21 07:45:45,738 - INFO - api.simple_chat - simple_chat.py:208 - Retrieved 20 documents
2026-04-21 07:45:45,738 - INFO - api.rag - rag.py:74 - Dialog turns list exists but is empty
2026-04-21 07:45:45,739 - INFO - api.rag - rag.py:88 - Returning 0 dialog turns from memory
INFO:     127.0.0.1:45056 - "POST /chat/completions/stream HTTP/1.1" 200 OK

Note the line api_kwargs: {'model': 'nomic-embed-text', 'prompt': 'Analyze this GitHub repository...create a wiki structure...'}: the wiki-generation request is sent with the embedding model nomic-embed-text, so it hangs at this step and the wiki document is never generated.
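The model mix-up matters because Ollama's two endpoints return different things: /api/embeddings maps a prompt to a vector, while /api/generate (or /api/chat) produces text. A minimal sketch of the routing the generation step should perform (the helper name is illustrative, not DeepWiki's actual code; the base URL is the one from the log):

```python
# Illustrative sketch only -- not DeepWiki's actual code. It shows which
# Ollama endpoint each kind of task should target.

def ollama_endpoint(task: str, base_url: str = "http://192.168.27.21:11434") -> str:
    """Return the Ollama HTTP endpoint appropriate for a task."""
    if task == "embedding":
        # /api/embeddings returns only a vector -- it can never produce wiki text
        return f"{base_url}/api/embeddings"
    if task == "generation":
        # text generation must go through /api/generate (or /api/chat)
        return f"{base_url}/api/generate"
    raise ValueError(f"unknown task: {task}")

# The failing request in the log above went to /api/embeddings even though
# the task was wiki-structure generation:
print(ollama_endpoint("generation"))
```

Since the embeddings endpoint returned 200 OK in the log, the hang is consistent with the caller waiting for generated text that an embedding response cannot contain.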

Environment

  • DeepWiki version: ghcr.io/asyncfuncai/deepwiki-open:latest (pulled 2026-04-21)
  • Ollama version: 0.21.0
  • Models: generator configured as qwen3:4b, embedder as nomic-embed-text
  • Deployment: docker run

Configuration

Contents of generator.json:

{
  "default_provider": "ollama",
  "providers": {
    "ollama": {
      "default_model": "qwen3:4b",
      ...
    }
  }
}
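Given that config, the generation step should resolve its model along these lines. This is a hypothetical sketch, not DeepWiki's actual code: resolve_generator_model is an illustrative helper, and the dict below only mirrors the fields shown above (the elided keys are omitted).

```python
import json

# Config fragment mirroring generator.json above (elided keys omitted).
config = json.loads("""
{
  "default_provider": "ollama",
  "providers": {
    "ollama": { "default_model": "qwen3:4b" }
  }
}
""")

def resolve_generator_model(cfg: dict) -> str:
    """Hypothetical helper: pick the generation model for the default provider."""
    provider = cfg["default_provider"]
    return cfg["providers"][provider]["default_model"]

# With this config, the generation step should use qwen3:4b,
# never the embedder nomic-embed-text.
print(resolve_generator_model(config))  # qwen3:4b
```

If the reported behavior is accurate, the bug would be that the generation path ignores this resolution and reuses the embedder's model name instead.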

Additional information

  • Embedding with nomic-embed-text completed successfully (425 documents processed).
  • I suspect websocket_wiki.py hard-codes, or incorrectly selects, the embedding model for the generation task.
