Is there an efficient way to use memory_efficient_attention with a causal mask that has a small rectangle of zeros? #1131

Open
arilato opened this issue Oct 22, 2024 · 0 comments

arilato commented Oct 22, 2024

❓ Questions and Help

I understand I can pass a custom mask to memory_efficient_attention, but materializing it is very inefficient for what I'm trying to do. I want to add a small rectangle of masked-out entries (zeros, or -inf for an additive bias) to an otherwise causal attention mask, near the lower-right diagonal. Concretely, I have a sequence
(context, m1, m2)
where each part is a series of tokens, and m2 must not be able to attend to m1.

Is there a memory-efficient way to do this in xformers without materializing the entire mask?
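For reference, the closest workaround I can sketch (assuming my installed xformers version exposes LowerTriangularMask and LowerTriangularFromBottomRightMask; the helper below is just illustrative) is to split the computation into two calls so that m1 is simply absent from the keys/values of the second call:

```python
import torch
from xformers.ops import memory_efficient_attention
from xformers.ops.fmha.attn_bias import (
    LowerTriangularFromBottomRightMask,
    LowerTriangularMask,
)


def split_causal_attention(q, k, v, len_ctx, len_m1, len_m2):
    """q, k, v: [B, len_ctx + len_m1 + len_m2, H, D]."""
    s1 = len_ctx + len_m1  # boundary between (context, m1) and m2

    # Call 1: queries in (context, m1) attend causally within (context, m1).
    # Causality already hides m2 (it is not even in the key/value tensors).
    out1 = memory_efficient_attention(
        q[:, :s1], k[:, :s1], v[:, :s1],
        attn_bias=LowerTriangularMask(),
    )

    # Call 2: queries in m2 attend to all of context, plus m2 causally.
    # m1 is dropped from the keys/values, so the rectangle of -inf never
    # has to be materialized.
    k2 = torch.cat([k[:, :len_ctx], k[:, s1:]], dim=1)
    v2 = torch.cat([v[:, :len_ctx], v[:, s1:]], dim=1)
    out2 = memory_efficient_attention(
        q[:, s1:], k2, v2,
        attn_bias=LowerTriangularFromBottomRightMask(),
    )

    # Each query's softmax is normalized over exactly the keys it is allowed
    # to see, so concatenating the outputs matches the fully-masked result.
    return torch.cat([out1, out2], dim=1)
```

This should give the same output as the full masked call, since every query's softmax runs over exactly the keys it may attend to, but it requires slicing and re-concatenating the tensors by hand, so I'm hoping there is a more direct or built-in way to express this pattern.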
