Unlike the `left_product` function, the `forward()` function does not use the attention mask. How can I use the attention mask in the `forward()` method?