Add an Attention compute layer #29

Description

@peterlau123

1. Integrate flash attention
   - CPU
   - CUDA flash attention inference
   - Metal
2. Add unit tests
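One possible shape for the unit tests, sketched in Python for brevity: compare a straightforward softmax attention against a block-wise version that uses the running-max ("online softmax") rescaling that flash attention kernels rely on. Nothing here comes from the repository. The function names, the single-head list-of-rows layout, and the `block` parameter are all assumptions made for illustration, and a real test would call the CPU, CUDA, or Metal backend in place of `tiled_attention`.

```python
import math
import random


def naive_attention(q, k, v):
    """Reference softmax(q k^T / sqrt(d)) v for one head; q, k, v are lists of rows."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * vj[c] for w, vj in zip(weights, v)) / total
            for c in range(len(v[0]))
        ])
    return out


def tiled_attention(q, k, v, block=2):
    """Same result, computed one key/value block at a time (hypothetical stand-in
    for a backend kernel)."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        m = -math.inf              # running max of all scores seen so far
        l = 0.0                    # running softmax denominator
        acc = [0.0] * len(v[0])    # running unnormalized output row
        for start in range(0, len(k), block):
            kb = k[start:start + block]
            vb = v[start:start + block]
            scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in kb]
            m_new = max(m, max(scores))
            correction = math.exp(m - m_new)  # rescales earlier partial sums
            p = [math.exp(s - m_new) for s in scores]
            l = l * correction + sum(p)
            acc = [
                a * correction + sum(pj * vj[c] for pj, vj in zip(p, vb))
                for c, a in enumerate(acc)
            ]
            m = m_new
        out.append([a / l for a in acc])
    return out


def max_abs_diff(x, y):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(a - b) for rx, ry in zip(x, y) for a, b in zip(rx, ry))


def random_matrix(rng, rows, cols):
    """Matrix of uniform values in [-1, 1) drawn from the given random.Random."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]
```

A test along these lines would draw random `q`, `k`, `v` with a fixed seed and assert that `max_abs_diff` between the two outputs stays under a small tolerance. The tolerance would likely need loosening for half-precision GPU kernels.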

Metadata

Assignees

Labels

compute (compute related components), enhancement (New feature or request)

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests
