Commit Graph

22 Commits (8cc8f645cd1d971a3bef52f625b7881f17c6d22b)

Author SHA1 Message Date
Guangyao Zhang 669849d74b
[ShardFormer] Add Ulysses Sequence Parallelism support for Command-R, Qwen2 and ChatGLM (#5897)
5 months ago
Zhongkai Zhao 8e412a548e
[shardformer] Sequence Parallelism Optimization (#5533)
8 months ago
digger yu 049121d19d
[hotfix] fix typo change enabel to enable under colossalai/shardformer/ (#5317)
9 months ago
flybird11111 29695cf70c
[example]add gpt2 benchmark example script. (#5295)
9 months ago
flybird11111 02d2328a04
support linear accumulation fusion (#5199)
11 months ago
アマデウス 126cf180bc
[hotfix] fixed memory usage of shardformer module replacement (#5122)
1 year ago
flybird11111 576a2f7b10
[gemini] gemini support tensor parallelism. (#4942)
1 year ago
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
1 year ago
Bin Jia 86d22581e4
[shardformer] Add overlap optional for HybridParallelPlugin (#4615)
1 year ago
Bin Jia e241b74f24
[shardformer] Add overlap support for gpt2 (#4535)
1 year ago
Bin Jia c554b7f559
[shardformer/fix overlap bug] fix overlap bug, add overlap as an option in shardco… (#4516)
1 year ago
flybird11111 a27e0bb494
[shardformer] bert support sequence parallel. (#4455)
1 year ago
Bin Jia 7c8be77081
[shardformer/sequence parallel] support gpt2 seq parallel with pp/dp/tp (#4460)
1 year ago
Bin Jia 424629fea0
[shardformer/sequence parallel] Cherry pick commit to new branch (#4450)
1 year ago
github-actions[bot] c77b3b19be
[format] applied code formatting on changed files in pull request 4152 (#4157)
1 year ago
Kun Lin 8af29ee47a [shardformer] support vision transformer (#4096)
1 year ago
Frank Lee 70c58cfd4f [shardformer] supported fused qkv checkpoint (#4073)
1 year ago
Frank Lee f22ddacef0 [shardformer] refactored the shardformer layer structure (#4053)
1 year ago
Frank Lee 015af592f8 [shardformer] integrated linear 1D with dtensor (#3996)
1 year ago
FoolPlayer ab8a47f830 [shardformer] add Dropout layer support different dropout pattern (#3856)
1 year ago
Frank Lee ddcf58cacf
Revert "[sync] sync feature/shardformer with develop"
1 year ago
FoolPlayer 21a3915c98 [shardformer] add Dropout layer support different dropout pattern (#3856)
1 year ago