Is there an efficient way to use memory_efficient_attention with a causal mask that has a small rectangle of zeros? #1131

Open
arilato opened this issue Oct 22, 2024 · 0 comments

arilato commented Oct 22, 2024

❓ Questions and Help

I understand I can pass a custom mask to memory_efficient_attention, but materializing it is very inefficient for what I'm trying to do. I want to add a small rectangle of masked-out entries (zeros, or -inf for an additive bias) to an otherwise causal attention mask, near the lower-right diagonal. Concretely, I have a sequence
(context, m1, m2)
where each part is a series of tokens, and m2 must not be able to attend to m1.

Is there a memory-efficient way to do this in xformers without materializing the entire mask?
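For reference, the closest workaround I can sketch (assuming my installed xformers version exposes LowerTriangularMask and LowerTriangularFromBottomRightMask; the helper below is just illustrative) is to split the computation into two calls so that m1 is simply absent from the keys/values of the second call:

```python
import torch
from xformers.ops import memory_efficient_attention
from xformers.ops.fmha.attn_bias import (
    LowerTriangularFromBottomRightMask,
    LowerTriangularMask,
)


def split_causal_attention(q, k, v, len_ctx, len_m1, len_m2):
    """q, k, v: [B, len_ctx + len_m1 + len_m2, H, D]."""
    s1 = len_ctx + len_m1  # boundary between (context, m1) and m2

    # Call 1: queries in (context, m1) attend causally within (context, m1).
    # Causality already hides m2 (it is not even in the key/value tensors).
    out1 = memory_efficient_attention(
        q[:, :s1], k[:, :s1], v[:, :s1],
        attn_bias=LowerTriangularMask(),
    )

    # Call 2: queries in m2 attend to all of context, plus m2 causally.
    # m1 is dropped from the keys/values, so the rectangle of -inf never
    # has to be materialized.
    k2 = torch.cat([k[:, :len_ctx], k[:, s1:]], dim=1)
    v2 = torch.cat([v[:, :len_ctx], v[:, s1:]], dim=1)
    out2 = memory_efficient_attention(
        q[:, s1:], k2, v2,
        attn_bias=LowerTriangularFromBottomRightMask(),
    )

    # Each query's softmax is normalized over exactly the keys it is allowed
    # to see, so concatenating the outputs matches the fully-masked result.
    return torch.cat([out1, out2], dim=1)
```

This should give the same output as the full masked call, since every query's softmax runs over exactly the keys it may attend to, but it requires slicing and re-concatenating the tensors by hand, so I'm hoping there is a more direct or built-in way to express this pattern.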
