wangbluo | fd92789af2 | fix | 2024-10-15 13:26:44 +08:00 |
wangbluo | 6be9862aaf | fix | 2024-10-15 11:56:49 +08:00 |
wangbluo | 3dc08c8a5a | fix | 2024-10-15 11:01:34 +08:00 |
wangbluo | 8ff7d0c780 | fix | 2024-10-14 18:16:03 +08:00 |
wangbluo | 3201377e94 | fix | 2024-10-14 18:06:24 +08:00 |
wangbluo | 23199e34cc | fix | 2024-10-14 18:01:53 +08:00 |
wangbluo | 703bb5c18d | fix the test | 2024-10-11 17:34:20 +08:00 |
wangbluo | 4e0e99bb6a | fix the test | 2024-10-11 17:31:40 +08:00 |
Edenzzzz | f5c84af0b0 | [Feature] Zigzag Ring attention (#5905) | 2024-08-16 13:56:38 +08:00 |
* halfway
* fix cross-PP-stage position id length diff bug
* fix typo
* fix typo
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* unified cross entropy func for all shardformer models
* remove redundant lines
* add basic ring attn; debug cross entropy
* fwd bwd logic complete
* fwd bwd logic complete; add experimental triton rescale
* precision tests passed
* precision tests passed
* fix typos and remove misc files
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add sp_mode to benchmark; fix varlen interface
* update softmax_lse shape by new interface
* change tester name
* remove buffer clone; support packed seq layout
* add varlen tests
* fix typo
* all tests passed
* add dkv_group; fix mask
* remove debug statements
---------
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>