Commit Graph

396 Commits (83716e9feb8c2f16bbc6b007885cfdcec92af978)

Author SHA1 Message Date
github-actions[bot] 8921a73c90
[format] applied code formatting on changed files in pull request 5067 (#5072)
1 year ago
Xu Kai fb103cfd6e
[inference] update examples and engine (#5073)
1 year ago
Hongxin Liu e5ce4c8ea6
[npu] add npu support for gemini and zero (#5067)
1 year ago
Cuiqing Li (李崔卿) bce919708f
[Kernels]added flash-decoidng of triton (#5063)
1 year ago
Xu Kai fd6482ad8c
[inference] Refactor inference architecture (#5057)
1 year ago
flybird11111 bc09b95f50
[exampe] fix llama example' loss error when using gemini plugin (#5060)
1 year ago
Elsa Granger b2ad0d9e8f
[pipeline,shardformer] Fix p2p efficiency in pipeline, allow skipping loading weight not in weight_map when `strict=False`, fix llama flash attention forward, add flop estimation by megatron in llama benchmark (#5017)
1 year ago
Cuiqing Li (李崔卿) 28052a71fb
[Kernels]Update triton kernels into 2.1.0 (#5046)
1 year ago
Zhongkai Zhao 70885d707d
[hotfix] Suport extra_kwargs in ShardConfig (#5031)
1 year ago
Wenhao Chen 724441279b
[moe]: fix ep/tp tests, add hierarchical all2all (#4982)
1 year ago
Xuanlei Zhao f71e63b0f3
[moe] support optimizer checkpoint (#5015)
1 year ago
Xuanlei Zhao dc003c304c
[moe] merge moe into main (#4978)
1 year ago
Cuiqing Li 459a88c806
[Kernels]Updated Triton kernels into 2.1.0 and adding flash-decoding for llama token attention (#4965)
1 year ago
アマデウス 4e4a10c97d
updated c++17 compiler flags (#4983)
1 year ago
Jianghai c6cd629e7a
[Inference]ADD Bench Chatglm2 script (#4963)
1 year ago
Xu Kai 785802e809
[inference] add reference and fix some bugs (#4937)
1 year ago
Cuiqing Li 3a41e8304e
[Refactor] Integrated some lightllm kernels into token-attention (#4946)
1 year ago
Xu Kai 611a5a80ca
[inference] Add smmoothquant for llama (#4904)
1 year ago
Blagoy Simandoff 8aed02b957
[nfc] fix minor typo in README (#4846)
1 year ago
Xu Kai d1fcc0fa4d
[infer] fix test bug (#4838)
1 year ago
Jianghai 013a4bedf0
[inference]fix import bug and delete down useless init (#4830)
1 year ago
Yuanheng Zhao 573f270537
[Infer] Serving example w/ ray-serve (multiple GPU case) (#4841)
1 year ago
Yuanheng Zhao 3a74eb4b3a
[Infer] Colossal-Inference serving example w/ TorchServe (single GPU case) (#4771)
1 year ago
binmakeswell 822051d888
[doc] update slack link (#4823)
1 year ago
flybird11111 26cd6d850c
[fix] fix weekly runing example (#4787)
1 year ago
Xu Kai 946ab56c48
[feature] add gptq for inference (#4754)
1 year ago
Baizhou Zhang df66741f77
[bug] fix get_default_parser in examples (#4764)
1 year ago
Wenhao Chen 7b9b86441f
[chat]: update rm, add wandb and fix bugs (#4471)
1 year ago
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
1 year ago
github-actions[bot] 3c6b831c26
[format] applied code formatting on changed files in pull request 4743 (#4750)
1 year ago
Hongxin Liu b5f9e37c70
[legacy] clean up legacy code (#4743)
1 year ago
flybird11111 4c4482f3ad
[example] llama2 add fine-tune example (#4673)
1 year ago
Bin Jia 608cffaed3
[example] add gpt2 HybridParallelPlugin example (#4653)
1 year ago
binmakeswell ce97790ed7
[doc] fix llama2 code link (#4726)
1 year ago
Baizhou Zhang 068372a738
[doc] add potential solution for OOM in llama2 example (#4699)
1 year ago
Cuiqing Li bce0f16702
[Feature] The first PR to Add TP inference engine, kv-cache manager and related kernels for our inference system (#4577)
1 year ago
Hongxin Liu 554aa9592e
[legacy] move communication and nn to legacy and refactor logger (#4671)
1 year ago
flybird11111 7486ed7d3a
[shardformer] update llama2/opt finetune example and fix llama2 policy (#4645)
1 year ago
Baizhou Zhang 295b38fecf
[example] update vit example for hybrid parallel plugin (#4641)
1 year ago
Baizhou Zhang 660eed9124
[pipeline] set optimizer to optional in execute_pipeline (#4630)
1 year ago
Hongxin Liu fae6c92ead
Merge branch 'main' into feature/shardformer
1 year ago
Hongxin Liu ac178ca5c1
[legacy] move builder and registry to legacy (#4603)
1 year ago
Hongxin Liu 8accecd55b
[legacy] move engine to legacy (#4560)
1 year ago
Hongxin Liu 89fe027787
[legacy] move trainer to legacy (#4545)
1 year ago
flybird11111 ec0866804c
[shardformer] update shardformer readme (#4617)
1 year ago
Hongxin Liu a39a5c66fe
Merge branch 'main' into feature/shardformer
1 year ago
flybird11111 0a94fcd351
[shardformer] update bert finetune example with HybridParallelPlugin (#4584)
1 year ago
binmakeswell 8d7b02290f
[doc] add llama2 benchmark (#4604)
1 year ago
Tian Siyuan f1ae8c9104
[example] change accelerate version (#4431)
1 year ago
ChengDaqi2023 8e2e1992b8
[example] update streamlit 0.73.1 to 1.11.1 (#4386)
1 year ago
Hongxin Liu 0b00def881
[example] add llama2 example (#4527)
1 year ago
Hongxin Liu 27061426f7
[gemini] improve compatibility and add static placement policy (#4479)
1 year ago
Tian Siyuan ff836790ae
[doc] fix a typo in examples/tutorial/auto_parallel/README.md (#4430)
1 year ago
binmakeswell 089c365fa0
[doc] add Series A Funding and NeurIPS news (#4377)
1 year ago
caption 16c0acc01b
[hotfix] update gradio 3.11 to 3.34.0 (#4329)
1 year ago
binmakeswell ef4b99ebcd
add llama example CI
1 year ago
binmakeswell 7ff11b5537
[example] add llama pretraining (#4257)
1 year ago
github-actions[bot] 4e9b09c222
Automated submodule synchronization (#4217)
1 year ago
digger yu 2d40759a53
fix #3852 path error (#4058)
1 year ago
Jianghai 31dc302017
[examples] copy resnet example to image (#4090)
1 year ago
Baizhou Zhang 4da324cd60
[hotfix]fix argument naming in docs and examples (#4083)
1 year ago
LuGY 160c64c645
[example] fix bucket size in example of gpt gemini (#4028)
1 year ago
Baizhou Zhang b3ab7fbabf
[example] update ViT example using booster api (#3940)
1 year ago
Liu Ziming e277534a18
Merge pull request #3905 from MaruyamaAya/dreambooth
1 year ago
digger yu 33eef714db
fix typo examples and docs (#3932)
1 year ago
Maruyama_Aya 9b5e7ce21f
modify shell for check
1 year ago
digger yu 407aa48461
fix typo examples/community/roberta (#3925)
1 year ago
Maruyama_Aya 730a092ba2
modify shell for check
1 year ago
Maruyama_Aya 49567d56d1
modify shell for check
1 year ago
Maruyama_Aya 039854b391
modify shell for check
1 year ago
Baizhou Zhang e417dd004e
[example] update opt example using booster api (#3918)
1 year ago
Maruyama_Aya cf4792c975
modify shell for check
1 year ago
Maruyama_Aya c94a33579b
modify shell for check
1 year ago
Liu Ziming b306cecf28
[example] Modify palm example with the new booster API (#3913)
1 year ago
wukong1992 a55fb00c18
[booster] update bert example, using booster api (#3885)
1 year ago
Maruyama_Aya 4fc8bc68ac
modify file path
1 year ago
Maruyama_Aya b4437e88c3
fixed port
1 year ago
Maruyama_Aya 79c9f776a9
fixed port
1 year ago
Maruyama_Aya d3379f0be7
fixed model saving bugs
1 year ago
Maruyama_Aya b29e1f0722
change directory
1 year ago
Maruyama_Aya 1c1f71cbd2
fixing insecure hash function
1 year ago
Maruyama_Aya b56c7f4283
update shell file
1 year ago
Maruyama_Aya 176010f289
update performance evaluation
1 year ago
Maruyama_Aya 25447d4407
modify path
2 years ago
Maruyama_Aya 60ec33bb18
Add a new example of Dreambooth training using the booster API
2 years ago
jiangmingyan 5f79008c4a
[example] update gemini examples (#3868)
2 years ago
digger yu 518b31c059
[docs] change placememt_policy to placement_policy (#3829)
2 years ago
github-actions[bot] 62c7e67f9f
[format] applied code formatting on changed files in pull request 3786 (#3787)
2 years ago
binmakeswell ad2cf58f50
[chat] add performance and tutorial (#3786)
2 years ago
binmakeswell 15024e40d9
[auto] fix install cmd (#3772)
2 years ago
digger-yu b7141c36dd
[CI] fix some spelling errors (#3707)
2 years ago
Hongxin Liu 3bf09efe74
[booster] update prepare dataloader method for plugin (#3706)
2 years ago
Hongxin Liu f83ea813f5
[example] add train resnet/vit with booster example (#3694)
2 years ago
Hongxin Liu d556648885
[example] add finetune bert with booster example (#3693)
2 years ago
digger-yu b9a8dff7e5
[doc] Fix typo under colossalai and doc(#3618)
2 years ago
github-actions[bot] d544ed4345
[bot] Automated submodule synchronization (#3596)
2 years ago
digger-yu d0fbd4b86f
[example] fix community doc (#3586)
2 years ago
binmakeswell f1b3d60cae
[example] reorganize for community examples (#3557)
2 years ago
natalie_cao de84c0311a
Polish Code
2 years ago
binmakeswell 0c0455700f
[doc] add requirement and highlight application (#3516)
2 years ago