Commit Graph

84 Commits (19d1510ea26d10484a804eb62f6d03dbcc7c80a8)

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| hxwang | 3e2b6132b7 | [moe] clean legacy code | 4 months ago |
| botbw | 9b9b76bdcd | [moe] add mixtral dp grad scaling when not all experts are activated | 4 months ago |
| Hongxin Liu | 7b38964e3a | [shardformer] hotfix attn mask (#5947) | 4 months ago |
| Insu Jang | a521ffc9f8 | Add n_fused as an input from native_module (#5894) | 4 months ago |
| Guangyao Zhang | 669849d74b | [ShardFormer] Add Ulysses Sequence Parallelism support for Command-R, Qwen2 and ChatGLM (#5897) | 5 months ago |
| Edenzzzz | fbf33ecd01 | [Feature] Enable PP + SP for llama (#5868) | 5 months ago |
| Haze188 | 416580b314 | [MoE/ZeRO] Moe refactor with zero refactor (#5821) | 5 months ago |
| Runyu Lu | 3c7cda0c9a | [Inference]Lazy Init Support (#5785) | 5 months ago |
| GuangyaoZhang | 7a2b08646f | Remove CohereLayerNorm and use existing layernorm | 5 months ago |
| GuangyaoZhang | fe2e74c03a | fix precommit | 5 months ago |
| GuangyaoZhang | f656d61778 | change command | 5 months ago |
| Li Xingjian | 8554585a5f | [Inference] Fix flash-attn import and add model test (#5794) | 6 months ago |
| Yuanheng Zhao | df6747603f | [Colossal-Inference] (v0.1.0) Merge pull request #5739 from hpcaitech/feature/colossal-infer | 6 months ago |
| Haze188 | 22ce873c3f | [Shardformer] Add parallel output for shardformer models(bloom, falcon) (#5702) | 6 months ago |
| Jianghai | 61a1b2e798 | [Inference] Fix bugs and docs for feat/online-server (#5598) | 7 months ago |
| CjhHa1 | 7bbb28e48b | [Inference] resolve rebase conflicts | 7 months ago |
| Hongxin Liu | bbb2c21f16 | [shardformer] fix chatglm implementation (#5644) | 7 months ago |
| flybird11111 | 148506c828 | [coloattention]modify coloattention (#5627) | 7 months ago |
| Edenzzzz | 7ee569b05f | [hotfix] Fixed fused layernorm bug without apex (#5609) | 7 months ago |
| flybird11111 | a0ad587c24 | [shardformer] refactor embedding resize (#5603) | 7 months ago |
| Hongxin Liu | 641b1ee71a | [devops] remove post commit ci (#5566) | 8 months ago |
| Zhongkai Zhao | 8e412a548e | [shardformer] Sequence Parallelism Optimization (#5533) | 8 months ago |
| Hongxin Liu | 19e1a5cf16 | [shardformer] update colo attention to support custom mask (#5510) | 8 months ago |
| digger yu | 049121d19d | [hotfix] fix typo change enabel to enable under colossalai/shardformer/ (#5317) | 9 months ago |
| flybird11111 | 29695cf70c | [example]add gpt2 benchmark example script. (#5295) | 9 months ago |
| ver217 | 148469348a | Merge branch 'main' into sync/npu | 10 months ago |
| Hongxin Liu | d202cc28c0 | [npu] change device to accelerator api (#5239) | 11 months ago |
| Xuanlei Zhao | dd2c28a323 | [npu] use extension for op builder (#5172) | 11 months ago |
| flybird11111 | 02d2328a04 | support linear accumulation fusion (#5199) | 11 months ago |
| flybird11111 | 79718fae04 | [shardformer] llama support DistCrossEntropy (#5176) | 12 months ago |
| Xuanlei Zhao | d6df19bae7 | [npu] support triangle attention for llama (#5130) | 1 year ago |
| Wenhao Chen | 7172459e74 | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 1 year ago |
| アマデウス | 126cf180bc | [hotfix] fixed memory usage of shardformer module replacement (#5122) | 1 year ago |
| Xuanlei Zhao | 3acbf6d496 | [npu] add npu support for hybrid plugin and llama (#5090) | 1 year ago |
| Hongxin Liu | e5ce4c8ea6 | [npu] add npu support for gemini and zero (#5067) | 1 year ago |
| flybird11111 | 576a2f7b10 | [gemini] gemini support tensor parallelism. (#4942) | 1 year ago |
| Jianghai | ef4c14a5e2 | [Inference] Fix bug in ChatGLM2 Tensor Parallelism (#5014) | 1 year ago |
| littsk | 1a3315e336 | [hotfix] Add layer norm gradients all-reduce for sequence parallel (#4926) | 1 year ago |
| Hongxin Liu | 079bf3cb26 | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| Bin Jia | 86d22581e4 | [shardformer] Add overlap optional for HybridParallelPlugin (#4615) | 1 year ago |
| Bin Jia | e241b74f24 | [shardformer] Add overlap support for gpt2 (#4535) | 1 year ago |
| Bin Jia | c554b7f559 | [shardformer/fix overlap bug] fix overlap bug, add overlap as an option in shardco… (#4516) | 1 year ago |
| Baizhou Zhang | 44eab2b27f | [shardformer] support sharded checkpoint IO for models of HybridParallelPlugin (#4506) | 1 year ago |
| flybird11111 | 59e252ecdb | [shardformer] chatglm support sequence parallel (#4482) | 1 year ago |
| flybird11111 | a27e0bb494 | [shardformer] bert support sequence parallel. (#4455) | 1 year ago |
| Bin Jia | 7c8be77081 | [shardformer/sequence parallel] support gpt2 seq parallel with pp/dp/tp (#4460) | 1 year ago |
| Bin Jia | 424629fea0 | [shardformer/sequence parallel] Cherry pick commit to new branch (#4450) | 1 year ago |
| ver217 | 5d4efdf58f | [shardformer] fix import | 1 year ago |
| ver217 | 73a4144b91 | [shardformer] fix embedding | 1 year ago |
| Hongxin Liu | 172f7fa3cf | [misc] resolve code factor issues (#4433) | 1 year ago |