Pull requests: NVIDIA/TransformerEngine

Pull requests list

Update Ragged Offsets to 64 Bit
#1230 opened Oct 8, 2024 by mgoldfarb-nvidia
8 of 13 tasks
[Pytorch] Check gradient in test numerics
#1229 opened Oct 8, 2024 by pggPL
7 of 13 tasks
[TE/JAX] Enabling CudaGraph for custom calls with FFI (label: jax)
#1228 opened Oct 7, 2024 by phu0ngng
4 of 13 tasks
[PyTorch] Drop FA2 as an installation requirement
#1226 opened Oct 7, 2024 by cyanguwa
8 of 13 tasks
Small fixes to Float8Tensor
#1225 opened Oct 4, 2024 by ptrendx
3 of 13 tasks
[PyTorch] Add documentation for FP8 attention checkpointing
#1223 opened Oct 2, 2024 by cyanguwa
8 of 13 tasks
Create README.md for examples/
#1221 opened Oct 1, 2024 by sbhavani
5 tasks
Fix bug in torch compile and seqdim is integer
#1217 opened Sep 29, 2024 by wplf
7 of 13 tasks
[PyTorch] Improve get_attention_backend
#1214 opened Sep 27, 2024 by cyanguwa
8 of 13 tasks
[PyTorch] Improve CP P2P efficiency
#1208 opened Sep 26, 2024 by yenchenlin
1 of 6 tasks
[JAX] Expose sliding window attn to TE-JAX API (labels: enhancement, jax)
#1205 opened Sep 25, 2024 by huanghua1994
8 of 13 tasks
[PyTorch] Debug dtype casting in operation-based API (label: bug)
#1202 opened Sep 24, 2024 by timmoon10
7 of 13 tasks
Draft: Use fused push_send_recv kernel for TP AG and RS overlaps
#1200 opened Sep 24, 2024 by erhoo82
13 tasks
[PyTorch] Fused dbias-cast-transpose in bias operation
#1168 opened Sep 6, 2024 by timmoon10
7 of 13 tasks
Fix autocast deprecation warning.
#1167 opened Sep 6, 2024 by jondeaton
[PyTorch] Activation operations
#1164 opened Sep 6, 2024 by timmoon10
6 of 13 tasks
[PyTorch] Avoid saving fp8_tensors in certain scenarios
#1143 opened Aug 28, 2024 by cyanguwa
8 of 13 tasks
[PyTorch] Userbuffers support in operation-based API
#1142 opened Aug 27, 2024 by timmoon10
7 of 13 tasks
Norms Refractor
#1140 opened Aug 27, 2024 by phu0ngng Draft
5 of 13 tasks
Don't save fp8 q/k/v/out tensors when using bf16 bprop
#1139 opened Aug 27, 2024 by guyueh1
13 tasks
Fix param input order for cudagraph (label: bug)
#1138 opened Aug 27, 2024 by yifeis-nv
4 of 13 tasks
Add high_precision_init_val to model params when using fp8_model_init
#1121 opened Aug 19, 2024 by kunlunl
8 of 13 tasks