Question about L2 cache swizzling in FP8 GEMM #1

@luongthecong123

Thank you for sharing your amazing solutions. Could you give more information on the logic behind the L2 cache swizzling used in your kernel?
Is it similar to the "Scheduling and L2 cache" section in Kernel 6 of this blog post, where the author improved the L2 cache hit rate?

https://cudaforfun.substack.com/p/outperforming-cublas-on-h100-a-worklog

Thanks,
Cong
