Skip to content

Can't run glm-4-9b-chat on cuda 12 #18

@alabulei1

Description

@alabulei1

When I run glm-4-9b-chat-Q5_K_M.gguf on the Cuda 12 machine, the API server can be started successfully. However, when I send a question, the API server will crash.

The command I used to start the API server is as follows:

wasmedge --dir .:. --nn-preload default:GGML:AUTO:glm-4-9b-chat-Q5_K_M.gguf \
  llama-api-server.wasm \
  --prompt-template glm-4-chat \
  --ctx-size 4096 \
  --model-name glm-4-9b-chat

Here is the error message

[2024-07-11 07:50:12.036] [wasi_logging_stdout] [info] llama_core: llama_core::chat in llama-core/src/chat.rs:1315: Get the model metadata.
[2024-07-11 07:50:12.036] [wasi_logging_stdout] [error] llama_core: llama_core::chat in llama-core/src/chat.rs:1349: The model `internlm2_5-7b-chat` does not exist in the chat graphs.
[2024-07-11 07:50:12.036] [wasi_logging_stdout] [error] chat_completions_handler: llama_api_server::backend::ggml in llama-api-server/src/backend/ggml.rs:392: Failed to get chat completions. Reason: The model `internlm2_5-7b-chat` does not exist in the chat graphs.
[2024-07-11 07:50:12.036] [wasi_logging_stdout] [error] response: llama_api_server::error in llama-api-server/src/error.rs:25: 500 Internal Server Error: Failed to get chat completions. Reason: The model `internlm2_5-7b-chat` does not exist in the chat graphs.
[2024-07-11 07:50:12.036] [wasi_logging_stdout] [info] chat_completions_handler: llama_api_server::backend::ggml in llama-api-server/src/backend/ggml.rs:399: Send the chat completion response.
[2024-07-11 07:50:12.036] [wasi_logging_stdout] [error] response: llama_api_server in llama-api-server/src/main.rs:518: version: HTTP/1.1, body_size: 133, status: 500, is_informational: false, is_success: false, is_redirection: false, is_client_error: false, is_server_error: true

Versions

[2024-07-11 07:47:51.368] [wasi_logging_stdout] [info] server_config: llama_api_server in llama-api-server/src/main.rs:131: server version: 0.12.3

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions