f5c84af0b0 | Edenzzzz | 2024-08-16 13:56:38 +08:00
[Feature] Zigzag Ring attention (#5905)

* halfway
* fix cross-PP-stage position id length diff bug
* fix typo
* fix typo
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* unified cross entropy func for all shardformer models
* remove redundant lines
* add basic ring attn; debug cross entropy
* fwd bwd logic complete
* fwd bwd logic complete; add experimental triton rescale
* precision tests passed
* precision tests passed
* fix typos and remove misc files
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* add sp_mode to benchmark; fix varlen interface
* update softmax_lse shape by new interface
* change tester name
* remove buffer clone; support packed seq layout
* add varlen tests
* fix typo
* all tests passed
* add dkv_group; fix mask
* remove debug statements
---------
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
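The "zigzag" in this commit refers to how the sequence is partitioned across ranks for ring attention. A minimal single-process sketch of the standard zigzag layout (the function name `zigzag_split` is illustrative, not this PR's API): with P ranks, the sequence is cut into 2*P chunks, and rank r keeps chunks r and 2*P-1-r, pairing a cheap early chunk with an expensive late chunk so causal-attention work stays balanced across ranks.

```python
def zigzag_split(seq, num_ranks):
    """Assign sequence positions to ranks using the zigzag layout.

    Chunk i and chunk 2*num_ranks-1-i go to rank i, so every rank gets
    one "early" (cheap under causal masking) and one "late" (expensive)
    chunk, balancing attention FLOPs across the ring.
    """
    assert len(seq) % (2 * num_ranks) == 0
    size = len(seq) // (2 * num_ranks)
    chunks = [seq[i * size:(i + 1) * size] for i in range(2 * num_ranks)]
    return [chunks[r] + chunks[2 * num_ranks - 1 - r] for r in range(num_ranks)]

# e.g. 8 positions over 2 ranks:
# rank 0 holds positions [0, 1, 6, 7], rank 1 holds [2, 3, 4, 5]
```

Without the zigzag, a plain contiguous split would leave the rank holding the first sequence slice nearly idle under a causal mask while the last rank does most of the work.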
669849d74b | Guangyao Zhang | 2024-07-10 11:34:25 +08:00
[ShardFormer] Add Ulysses Sequence Parallelism support for Command-R, Qwen2 and ChatGLM (#5897)
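Ulysses-style sequence parallelism hinges on an all-to-all exchange around attention: before attention each rank holds its slice of the sequence with all heads; the exchange re-shards the data so each rank holds the full sequence for its slice of the heads (attention needs every position, but heads are independent). A hypothetical single-process simulation of that re-sharding, not the ShardFormer API:

```python
def seq_to_head_all_to_all(per_rank, num_ranks):
    """Simulate the Ulysses all-to-all on nested lists.

    per_rank[r] is rank r's local sequence slice: a list of rows (one per
    position), each row a list of per-head values. Returns the re-sharded
    view where out[r] holds the FULL sequence for rank r's head slice.
    """
    num_heads = len(per_rank[0][0])
    per_dst = num_heads // num_ranks
    out = []
    for dst in range(num_ranks):
        lo, hi = dst * per_dst, (dst + 1) * per_dst
        # gather head slice [lo:hi] from every rank's sequence slice, in order
        out.append([row[lo:hi] for src in range(num_ranks) for row in per_rank[src]])
    return out
```

In a real implementation the same movement is one collective (e.g. an all-to-all over the sequence-parallel process group) on the packed QKV tensors, and the inverse exchange restores the sequence sharding after attention.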
d84d68601a | GuangyaoZhang | 2024-06-18 03:32:42 +00:00
change 'xxx if xxx else None' to 'xxx or None'
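The refactor in this commit is a pure simplification: `x if x else None` and `x or None` are equivalent for every `x`, because `or` returns its left operand when it is truthy and its right operand otherwise. A sketch of the equivalence (`verbose` and `concise` are illustrative names, not repo functions):

```python
def verbose(x):
    # the form the commit removes
    return x if x else None

def concise(x):
    # the replacement: returns x itself when truthy, None when falsy
    return x or None

# both forms return the identical object for truthy and falsy inputs alike
for value in ["attn_mask", "", 0, [1, 2], None, 0.0]:
    assert verbose(value) is concise(value)
```

One caveat worth knowing when applying this idiom: it maps all falsy values (`""`, `0`, `[]`, ...) to `None`, so it is only a drop-in replacement where the original `if x` test did the same, as here.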
a83a2336e8 | GuangyaoZhang | 2024-06-18 02:56:47 +00:00
rebase master llama change
363cde6957 | GuangyaoZhang | 2024-06-18 02:32:41 +00:00
merge model and attention forward
7a2b08646f | GuangyaoZhang | 2024-06-18 02:32:41 +00:00
Remove CohereLayerNorm and use existing layernorm
fe2e74c03a | GuangyaoZhang | 2024-06-18 02:31:33 +00:00
fix precommit
f656d61778 | GuangyaoZhang | 2024-06-18 02:31:33 +00:00
change command
0b81163bc0 | GuangyaoZhang | 2024-06-18 02:31:33 +00:00
Copy llama to command