Commit Graph

24 Commits (810cafb2f987cac2bbe99ef491455921f197f315)

Author SHA1 Message Date
duanjunwen 1d328ff651 Merge branch 'main' into dev/zero_bubble
4 weeks ago
duanjunwen 5f0924361d [fix] fix linear (no tp) ops func name;
4 weeks ago
duanjunwen d2e05a99b3 [feat] support no tensor parallel Linear in shardformer; Add test for use weightGradStore and not use WeightGradStore
4 weeks ago
Hongxin Liu 646b3c5a90
[shardformer] fix linear 1d row and support uneven splits for fused qkv linear (#6084)
2 months ago
Edenzzzz f5c84af0b0
[Feature] Zigzag Ring attention (#5905)
3 months ago
Edenzzzz fbf33ecd01
[Feature] Enable PP + SP for llama (#5868)
5 months ago
flybird11111 a0ad587c24
[shardformer] refactor embedding resize (#5603)
7 months ago
Hongxin Liu 641b1ee71a
[devops] remove post commit ci (#5566)
8 months ago
Zhongkai Zhao 8e412a548e
[shardformer] Sequence Parallelism Optimization (#5533)
8 months ago
Hongxin Liu 19e1a5cf16
[shardformer] update colo attention to support custom mask (#5510)
8 months ago
littsk 1a3315e336
[hotfix] Add layer norm gradients all-reduce for sequence parallel (#4926)
1 year ago
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
1 year ago
ver217 5d4efdf58f [shardformer] fix import
1 year ago
FoolPlayer dd2bf02679 [shardformer] support SAM (#4231)
1 year ago
Hongxin Liu d921ce8391 [shardformer] support inplace sharding (#4251)
1 year ago
Frank Lee f3b6aaa6b7 [shardformer] supported fused normalization (#4112)
1 year ago
Frank Lee b1c2901530 [shardformer] supported bloom model (#4098)
1 year ago
Frank Lee d33a44e8c3 [shardformer] refactored layernorm (#4086)
1 year ago
FoolPlayer 92f6791095 [shardformer] Add layernorm (#4072)
1 year ago
Frank Lee f22ddacef0 [shardformer] refactored the shardformer layer structure (#4053)
1 year ago
FoolPlayer 4021b9a8a2 [shardformer] add gpt2 test and layer class refactor (#4041)
1 year ago
FoolPlayer ab8a47f830 [shardformer] add Dropout layer support different dropout pattern (#3856)
1 year ago
Frank Lee ddcf58cacf
Revert "[sync] sync feature/shardformer with develop"
1 year ago
FoolPlayer 21a3915c98 [shardformer] add Dropout layer support different dropout pattern (#3856)
1 year ago