
[feat] Support loading safetensors meta header files #1013

Open
wangshankun wants to merge 1 commit into main from dev/seko_dummy

Conversation

@wangshankun
Collaborator

No description provided.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a mechanism to export and use lightweight 'dummy-meta' safetensors files, which contain only tensor metadata (shapes and dtypes) without the actual weight data. This is useful for model initialization or configuration checks without loading full model weights. The changes include a new export tool tools/convert/export_dummy_meta.py, updates to base_model.py to handle these dummy files, and documentation. Feedback focuses on improving robustness by adding explicit error handling for missing metadata keys, validating file header sizes to prevent crashes on corrupted files, and correcting a typo in the documentation.
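
To make the idea of a header-only file concrete, here is a minimal sketch of how one could be written by hand. It assumes the dummy file keeps the tensor shapes and dtypes as a JSON string under the `_tensor_meta` key of `__metadata__`, flagged by `_is_dummy_meta`, which are the two keys this PR checks. The helper name `write_dummy_meta` is hypothetical and is not the PR's `export_dummy_meta.py`; whether the real tool lays out the file exactly this way is an assumption.

```python
import json
import struct


def write_dummy_meta(path: str, tensor_meta: dict) -> None:
    """Write a header-only file in the safetensors layout (hypothetical helper)."""
    header = {
        "__metadata__": {
            # Metadata values are stored as strings, so the tensor
            # description is serialized to a JSON string first.
            "_is_dummy_meta": "true",
            "_tensor_meta": json.dumps(tensor_meta),
        }
    }
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    with open(path, "wb") as f:
        # Layout: 8-byte little-endian header size, then the JSON header.
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        # No tensor data follows; the file carries metadata only.
```

For example, `write_dummy_meta("dummy.safetensors", {"blocks.0.weight": {"shape": [4, 8], "dtype": "BF16"}})` produces a file of well under a kilobyte regardless of how large the described weights are.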

Comment on lines +160 to +161
if metadata.get("_is_dummy_meta") == "true" and "_tensor_meta" in metadata:
    return json.loads(metadata["_tensor_meta"])

medium

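If _is_dummy_meta is "true" but _tensor_meta is missing, the current logic skips the dummy-meta handling and returns an empty dictionary. Model initialization can then fail later because of missing weights, with an error message that is hard to trace back to this file. Consider adding an explicit error check here.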

Suggested change
- if metadata.get("_is_dummy_meta") == "true" and "_tensor_meta" in metadata:
-     return json.loads(metadata["_tensor_meta"])
+ if metadata.get("_is_dummy_meta") == "true":
+     if "_tensor_meta" not in metadata:
+         raise ValueError(f"Dummy meta file {file_path} is missing '_tensor_meta' in __metadata__")
+     return json.loads(metadata["_tensor_meta"])
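
As a self-contained illustration of this check, the sketch below isolates the decision into one function. The name `tensor_meta_from_metadata`, the `None` return for regular files, and the extra handling of malformed JSON are assumptions made for the example; they are not taken from `base_model.py`.

```python
import json


def tensor_meta_from_metadata(metadata: dict, file_path: str):
    """Return tensor metadata for a dummy-meta file, or None for a regular file."""
    if metadata.get("_is_dummy_meta") != "true":
        # Not flagged as dummy-meta: the caller falls back to its normal path.
        return None
    if "_tensor_meta" not in metadata:
        raise ValueError(
            f"Dummy meta file {file_path} is missing '_tensor_meta' in __metadata__"
        )
    try:
        return json.loads(metadata["_tensor_meta"])
    except json.JSONDecodeError as exc:
        # Assumption: a malformed payload should also name the offending file.
        raise ValueError(
            f"Dummy meta file {file_path} has a malformed '_tensor_meta'"
        ) from exc
```

Failing at this point names the offending file directly, instead of surfacing later as a missing-weight error during model initialization.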

def read_tensor_metadata(file_path: str) -> dict:
    """Read tensor metadata from a full safetensors file header."""
    with open(file_path, "rb") as f:
        header_size = struct.unpack("<Q", f.read(8))[0]

medium

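Before unpacking the 8-byte header size, check how many bytes were actually read. If the file is corrupted or too small, unpacking directly raises struct.error. A length check allows a friendlier error message.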

Suggested change
- header_size = struct.unpack("<Q", f.read(8))[0]
+ header_bytes_8 = f.read(8)
+ if len(header_bytes_8) < 8:
+     raise ValueError(f"Invalid safetensors file (too small): {file_path}")
+ header_size = struct.unpack("<Q", header_bytes_8)[0]
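
The same idea can be taken one step further. The sketch below is a standalone header reader that also rejects a declared header size larger than the file and a header that is not valid JSON. The function name `read_safetensors_header` and the two additional checks are assumptions for illustration, not part of the PR.

```python
import json
import os
import struct


def read_safetensors_header(file_path: str) -> dict:
    """Read and validate the JSON header of a safetensors file (sketch)."""
    file_size = os.path.getsize(file_path)
    with open(file_path, "rb") as f:
        size_bytes = f.read(8)
        if len(size_bytes) < 8:
            raise ValueError(f"Invalid safetensors file (too small): {file_path}")
        (header_size,) = struct.unpack("<Q", size_bytes)
        # A corrupted size field could otherwise trigger a huge read.
        if header_size > file_size - 8:
            raise ValueError(
                f"Invalid safetensors file (header size {header_size} "
                f"exceeds file size): {file_path}"
            )
        header_bytes = f.read(header_size)
    try:
        header = json.loads(header_bytes)
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError(f"Invalid safetensors header JSON: {file_path}") from exc
    if not isinstance(header, dict):
        raise ValueError(f"Invalid safetensors header (not an object): {file_path}")
    return header
```

Every failure mode surfaces as a `ValueError` that names the file, so callers need only one `except` clause.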

for key, info in header.items():
    if key == "__metadata__":
        continue
    tensor_meta[key] = {"shape": info["shape"], "dtype": info["dtype"]}

medium

Accessing info["shape"] and info["dtype"] directly may raise KeyError when handling non-standard or corrupted safetensors files. Consider checking that both keys exist first.

Suggested change
- tensor_meta[key] = {"shape": info["shape"], "dtype": info["dtype"]}
+ if "shape" in info and "dtype" in info:
+     tensor_meta[key] = {"shape": info["shape"], "dtype": info["dtype"]}
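
One consequence of the check above is that malformed entries are dropped silently. A possible variant, sketched below, keeps the same filtering but also returns the names of the skipped entries so the caller can log them or raise. The function name `collect_tensor_meta` and the returned tuple are assumptions for illustration only.

```python
def collect_tensor_meta(header: dict):
    """Split header entries into usable tensor metadata and skipped keys."""
    tensor_meta = {}
    skipped = []
    for key, info in header.items():
        if key == "__metadata__":
            continue  # free-form string metadata, not a tensor entry
        if isinstance(info, dict) and "shape" in info and "dtype" in info:
            tensor_meta[key] = {"shape": info["shape"], "dtype": info["dtype"]}
        else:
            skipped.append(key)  # recorded instead of being lost silently
    return tensor_meta, skipped
```

A caller that prefers strict behavior could raise whenever `skipped` is non-empty, while an export tool might only emit a warning.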



### 5. DiT weight header export
#### 5.1 safetensors meta → dump_\txt

medium

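The dump_\txt in the heading is inaccurate and contains an escape character. The tool actually generates a lightweight .safetensors weight header file, not a text file. Consider changing it to a clearer description.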

Suggested change
- #### 5.1 safetensors meta → dump_\txt
+ #### 5.1 safetensors meta → dummy safetensors
