Commit Graph

1629 Commits (4146f1c0ceea4de649f8639d11c26901187cb294)

Author SHA1 Message Date
Jianghai ce7ade3882
[inference] chatglm2 infer demo (#4724)
1 year ago
Xu Kai 946ab56c48
[feature] add gptq for inference (#4754)
1 year ago
littsk 1e0e080837
[bug] Fix the version check bug in colossalai run when generating the cmd. (#4713)
1 year ago
Hongxin Liu 3e05c07bb8
[lazy] support torch 2.0 (#4763)
1 year ago
Baizhou Zhang df66741f77
[bug] fix get_default_parser in examples (#4764)
1 year ago
Baizhou Zhang c0a033700c
[shardformer] fix master param sync for hybrid plugin/rewrite unwrapping logic (#4758)
1 year ago
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
1 year ago
Hongxin Liu b5f9e37c70
[legacy] clean up legacy code (#4743)
1 year ago
Xuanlei Zhao 32e7f99416
[kernel] update triton init #4740 (#4740)
1 year ago
flybird11111 4c4482f3ad
[example] llama2 add fine-tune example (#4673)
1 year ago
Xuanlei Zhao ac2797996b
[shardformer] add custom policy in hybrid parallel plugin (#4718)
1 year ago
Baizhou Zhang f911d5b09d
[doc] Add user document for Shardformer (#4702)
1 year ago
flybird11111 20190b49a5
[shardformer] to fix whisper test failed due to significant accuracy differences. (#4710)
1 year ago
Yuanheng Zhao e2c0e7f92a
[hotfix] Fix import error: colossal.kernel without triton installed (#4722)
1 year ago
flybird11111 c7d6975d29
[shardformer] fix GPT2DoubleHeadsModel (#4703)
1 year ago
digger yu 9c2feb2f0b
fix some typo with colossalai/device colossalai/tensor/ etc. (#4171)
1 year ago
Baizhou Zhang d8ceeac14e
[hotfix] fix typo in hybrid parallel io (#4697)
1 year ago
flybird11111 8844691f4b
[shardformer] update shardformer readme (#4689)
1 year ago
Baizhou Zhang 1d454733c4
[doc] Update booster user documents. (#4669)
1 year ago
Cuiqing Li bce0f16702
[Feature] The first PR to Add TP inference engine, kv-cache manager and related kernels for our inference system (#4577)
1 year ago
flybird11111 eedaa3e1ef
[shardformer]fix gpt2 double head (#4663)
1 year ago
Hongxin Liu 554aa9592e
[legacy] move communication and nn to legacy and refactor logger (#4671)
1 year ago
flybird11111 7486ed7d3a
[shardformer] update llama2/opt finetune example and fix llama2 policy (#4645)
1 year ago
Baizhou Zhang 295b38fecf
[example] update vit example for hybrid parallel plugin (#4641)
1 year ago
Baizhou Zhang 660eed9124
[pipeline] set optimizer to optional in execute_pipeline (#4630)
1 year ago
eric8607242 c3d5fa3bac
[shardformer] Support customized policy for llamav2 based model with HybridParallelPlugin (#4624)
1 year ago
Hongxin Liu fae6c92ead
Merge branch 'main' into feature/shardformer
1 year ago
Hongxin Liu ac178ca5c1 [legacy] move builder and registry to legacy (#4603)
1 year ago
Hongxin Liu 8accecd55b [legacy] move engine to legacy (#4560)
1 year ago
Hongxin Liu 89fe027787 [legacy] move trainer to legacy (#4545)
1 year ago
Hongxin Liu 807e01a4ba
[zero] hotfix master param sync (#4618)
1 year ago
flybird11111 ec0866804c
[shardformer] update shardformer readme (#4617)
1 year ago
Bin Jia 86d22581e4
[shardformer] Add overlap optional for HybridParallelPlugin (#4615)
1 year ago
Hongxin Liu a39a5c66fe
Merge branch 'main' into feature/shardformer
1 year ago
Baizhou Zhang e79b1e80e2
[checkpointio] support huggingface from_pretrained for all plugins (#4606)
1 year ago
flybird11111 0a94fcd351
[shardformer] update bert finetune example with HybridParallelPlugin (#4584)
1 year ago
Jianghai 24c0768795
[shardformer] Pytree fix (#4533)
1 year ago
Hongxin Liu 63ecafb1fb
[checkpointio] optimize zero optim checkpoint io (#4591)
1 year ago
Hongxin Liu 508ca36fe3
[pipeline] 1f1b schedule receive microbatch size (#4589)
1 year ago
LuGY cbac782254
[zero]fix zero ckptIO with offload (#4529)
1 year ago
Baizhou Zhang 38ccb8b1a3
[shardformer] support from_pretrained when loading model with HybridParallelPlugin (#4575)
1 year ago
Baizhou Zhang c9625dbb63
[shardformer] support sharded optimizer checkpointIO of HybridParallelPlugin (#4540)
1 year ago
Baizhou Zhang 2c787d7f47
[shardformer] fix submodule replacement bug when enabling pp (#4544)
1 year ago
flybird11111 ec18fc7340
[shardformer] support pp+tp+zero1 tests (#4531)
1 year ago
Lufang Chen 12c95a9fed
fix runtime prepare pass (#4502)
1 year ago
flybird11111 d367b88785
[shardformer] fix opt test hanging (#4521)
1 year ago
Bin Jia e241b74f24
[shardformer] Add overlap support for gpt2 (#4535)
1 year ago
Baizhou Zhang 0387a47e63
[shardformer] fix emerged bugs after updating transformers (#4526)
1 year ago
Hongxin Liu 0b00def881
[example] add llama2 example (#4527)
1 year ago
Bin Jia c554b7f559
[shardformer/fix overlap bug] fix overlap bug, add overlap as an option in shardco… (#4516)
1 year ago