From 6ababeb7dba9277ce6a4819e97da28715daee274 Mon Sep 17 00:00:00 2001
From: Philip Turner
Date: Sat, 15 Jul 2023 01:34:24 -0400
Subject: [PATCH 1/3] Update usage.md

---
 usage.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/usage.md b/usage.md
index d4a019a44..8986399b7 100644
--- a/usage.md
+++ b/usage.md
@@ -123,3 +123,5 @@ yields the fastest BERT training on cloud instances in MLPerf training 2.0 (June
 
 - [Jax](https://github.com/google/jax): an [implementation](https://github.com/lucidrains/flash-attention-jax) in Jax by [lucidrains](https://github.com/lucidrains/).
+
+- [Metal](https://developer.apple.com/metal/): an [implementation](https://github.com/philipturner/metal-flash-attention) by Philip Turner. This ports FlashAttention to mobile GPU architectures such as Apple silicon.
 

From 905c13a2d9a845e9ac8bd597e30a19270b058382 Mon Sep 17 00:00:00 2001
From: Philip Turner
Date: Sat, 15 Jul 2023 01:55:43 -0400
Subject: [PATCH 2/3] Update usage.md

---
 usage.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/usage.md b/usage.md
index 8986399b7..b30b748e6 100644
--- a/usage.md
+++ b/usage.md
@@ -124,4 +124,4 @@ yields the fastest BERT training on cloud instances in MLPerf training 2.0 (June
 - [Jax](https://github.com/google/jax): an [implementation](https://github.com/lucidrains/flash-attention-jax) in Jax by [lucidrains](https://github.com/lucidrains/).
 
-- [Metal](https://developer.apple.com/metal/): an [implementation](https://github.com/philipturner/metal-flash-attention) by Philip Turner. This ports FlashAttention to mobile GPU architectures such as Apple silicon.
+- [Metal](https://developer.apple.com/metal/): an [implementation](https://github.com/philipturner/metal-flash-attention) in Metal by Philip Turner. This ports FlashAttention to mobile GPU architectures such as Apple silicon.
 
From 4dbcaa144378c1461f0e64420c59fcdd1cd43409 Mon Sep 17 00:00:00 2001
From: Philip Turner
Date: Sat, 15 Jul 2023 08:40:46 -0400
Subject: [PATCH 3/3] Update usage.md

---
 usage.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/usage.md b/usage.md
index b30b748e6..133bfbdb6 100644
--- a/usage.md
+++ b/usage.md
@@ -124,4 +124,4 @@ yields the fastest BERT training on cloud instances in MLPerf training 2.0 (June
 - [Jax](https://github.com/google/jax): an [implementation](https://github.com/lucidrains/flash-attention-jax) in Jax by [lucidrains](https://github.com/lucidrains/).
 
-- [Metal](https://developer.apple.com/metal/): an [implementation](https://github.com/philipturner/metal-flash-attention) in Metal by Philip Turner. This ports FlashAttention to mobile GPU architectures such as Apple silicon.
+- [Metal](https://developer.apple.com/metal): an [implementation](https://github.com/philipturner/metal-flash-attention) in Metal by Philip Turner. This ports FlashAttention to mobile GPU architectures such as Apple silicon.
 