ColossalAI

Commit Graph

Author	SHA1	Message	Date
mandoxzhang	8f2c55f9c9	[example] remove redundant texts & update roberta (#3493 ) * update roberta example * update roberta example * modify conflict & update roberta	2 years ago
mandoxzhang	ab5fd127e3	[example] update roberta with newer ColossalAI (#3472 ) * update roberta example * update roberta example	2 years ago
NatalieC323	fb8fae6f29	Revert "[dreambooth] fixing the incompatibity in requirements.txt (#3190 ) (#3378 )" (#3481 )	2 years ago
binmakeswell	891b8e7fac	[chat] fix stage3 PPO sample sh command (#3477 )	2 years ago
NatalieC323	c701b77b11	[dreambooth] fixing the incompatibity in requirements.txt (#3190 ) (#3378 ) * Update requirements.txt * Update environment.yaml * Update README.md * Update environment.yaml * Update README.md * Update README.md * Delete requirements_colossalai.txt * Update requirements.txt * Update README.md	2 years ago
Frank Lee	4e9989344d	[doc] updated contributor list (#3474 )	2 years ago
jiangmingyan	52a933e175	[checkpoint] support huggingface style sharded checkpoint (#3461 ) * [checkpoint] support huggingface style sharded checkpoint * [checkpoint] support huggingface style sharded checkpoint * [checkpoint] support huggingface style sharded checkpoint * [checkpoint] support huggingface style sharded checkpoint * [checkpoint] support huggingface style sharded checkpoint --------- Co-authored-by: luchen <luchen@luchendeMBP.lan>	2 years ago
Fazzie-Maqianli	6afeb1202a	add community example dictionary (#3465 )	2 years ago
Frank Lee	80eba05b0a	[test] refactor tests with spawn (#3452 ) * [test] added spawn decorator * polish code * polish code * polish code * polish code * polish code * polish code	2 years ago
YY Lin	62f4e2eb07	[Chat]Add Peft support & fix the ptx bug (#3433 ) * Update ppo.py Fix the bug of fetching wrong batch data * Add peft model support in SFT and Prompts training In stage-1 and stage-3, the peft model supports are added. So the trained artifacts will be only a small lora additions instead of the whole bunch of files. * Delete test_prompts.txt * Delete test_pretrained.txt * Move the peft stuffs to a community folder. * Move the demo sft to community * delete dirty files * Add instructions to install peft using source * Remove Chinese comments * remove the Chinese comments	2 years ago
Dr-Corgi	73afb63594	[chat]fix save_model(#3377 ) The function save_model should be a part of PPOTrainer.	2 years ago
kingkingofall	57a3c4db6d	[chat]fix readme (#3429 ) * fix stage 2 fix stage 2 * add torch	2 years ago
Frank Lee	7d8d825681	[booster] fixed the torch ddp plugin with the new checkpoint api (#3442 )	2 years ago
YH	8f740deb53	Fix typo (#3448 )	2 years ago
ver217	933048ad3e	[test] reorganize zero/gemini tests (#3445 )	2 years ago
Camille Zhong	72cb4dd433	[Chat] fix the tokenizer "int too big to convert" error in SFT training (#3453 ) * Add RoBERTa for RLHF Stage 2 & 3 (test) RoBERTa for RLHF Stage 2 & 3 (still in testing) * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)" This reverts commit `06741d894d`. * Add RoBERTa for RLHF stage 2 & 3 1. add roberta folder under model folder 2. add roberta option in train_reward_model.py 3. add some test in testci * Update test_ci.sh * Revert "Update test_ci.sh" This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a. * Add RoBERTa for RLHF Stage 2 & 3 (test) RoBERTa for RLHF Stage 2 & 3 (still in testing) * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)" This reverts commit `06741d894d`. * Add RoBERTa for RLHF stage 2 & 3 1. add roberta folder under model folder 2. add roberta option in train_reward_model.py 3. add some test in testci * Update test_ci.sh * Revert "Update test_ci.sh" This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a. * update roberta with coati * chat ci update * Revert "chat ci update" This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846. * [Chat] fix the tokenizer "int too big to convert" error in SFT training fix the tokenizer error during SFT training using Bloom and OPT	2 years ago
Hakjin Lee	46c009dba4	[format] Run lint on colossalai.engine (#3367 )	2 years ago
Yuanchen	b92313903f	fix save_model indent error in ppo trainer (#3450 ) Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>	2 years ago
YuliangLiu0306	ffcdbf0f65	[autoparallel]integrate auto parallel feature with new tracer (#3408 ) * [autoparallel] integrate new analyzer in module level * unify the profiling method * polish * fix no codegen bug * fix pass bug * fix liveness test * polish	2 years ago
ver217	573af84184	[example] update examples related to zero/gemini (#3431 ) * [zero] update legacy import * [zero] update examples * [example] fix opt tutorial * [example] fix opt tutorial * [example] fix opt tutorial * [example] fix opt tutorial * [example] fix import	2 years ago
Yuanchen	773955abfa	fix save_model inin naive and ddp strategy (#3436 ) Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>	2 years ago
Frank Lee	1beb85cc25	[checkpoint] refactored the API and added safetensors support (#3427 ) * [checkpoint] refactored the API and added safetensors support * polish code	2 years ago
ver217	26b7aac0be	[zero] reorganize zero/gemini folder structure (#3424 ) * [zero] refactor low-level zero folder structure * [zero] fix legacy zero import path * [zero] fix legacy zero import path * [zero] remove useless import * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor legacy zero import path * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor legacy zero import path * [zero] fix test import path * [zero] fix test * [zero] fix circular import * [zero] update import	2 years ago
Yuanchen	b09adff724	[chat]fix sft training for bloom, gpt and opt (#3418 ) fix sft training for bloom, gpt and opt	2 years ago
Frank Lee	638a07a7f9	[test] fixed gemini plugin test (#3411 ) * [test] fixed gemini plugin test * polish code * polish code	2 years ago
Camille Zhong	30412866e0	[chatgpt] add pre-trained model RoBERTa for RLHF stage 2 & 3 (#3223 ) * Add RoBERTa for RLHF Stage 2 & 3 (test) RoBERTa for RLHF Stage 2 & 3 (still in testing) * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)" This reverts commit `06741d894d`. * Add RoBERTa for RLHF stage 2 & 3 1. add roberta folder under model folder 2. add roberta option in train_reward_model.py 3. add some test in testci * add test for reward model training * Update test_ci.sh * Revert "Update test_ci.sh" This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a. * Add RoBERTa for RLHF Stage 2 & 3 (test) RoBERTa for RLHF Stage 2 & 3 (still in testing) * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)" This reverts commit `06741d894d`. * Add RoBERTa for RLHF stage 2 & 3 1. add roberta folder under model folder 2. add roberta option in train_reward_model.py 3. add some test in testci * Update test_ci.sh * Revert "Update test_ci.sh" This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a. * update roberta with coati	2 years ago
Chris Sundström	94c24d9444	Improve grammar and punctuation (#3398 ) Minor changes to improve grammar and punctuation.	2 years ago
Jan Roudaut	dd367ce795	[doc] polish diffusion example (#3386 ) * [examples/images/diffusion]: README.md: typo fixes * Update README.md * Grammar fixes * Reformulated "Step 3" (xformers) introduction to the cost => at the cost + reworded pip availability.	2 years ago
Jan Roudaut	51cd2fec57	Typofix: malformed `xformers` version (#3384 ) s/0.12.0/0.0.12/	2 years ago
ver217	5f2e34e6c9	[booster] implement Gemini plugin (#3352 ) * [booster] add gemini plugin * [booster] update docstr * [booster] gemini plugin add coloparam convertor * [booster] fix coloparam convertor * [booster] fix gemini plugin device * [booster] add gemini plugin test * [booster] gemini plugin ignore sync bn * [booster] skip some model * [booster] skip some model * [booster] modify test world size * [booster] modify test world size * [booster] skip test	2 years ago
HELSON	1a1d68b053	[moe] add checkpoint for moe models (#3354 ) * [moe] add checkpoint for moe models * [hotfix] fix bugs in unit test	2 years ago
YuliangLiu0306	fee2af8610	[autoparallel] adapt autoparallel with new analyzer (#3261 ) * [autoparallel] adapt autoparallel with new analyzer * fix all node handler tests * polish * polish	2 years ago
アマデウス	e78a1e949a	fix torch 2.0 compatibility (#3346 )	2 years ago
Ofey Chan	8706a8c66c	[NFC] polish colossalai/engine/gradient_handler/__init__.py code style (#3329 )	2 years ago
yuxuan-lou	198a74b9fd	[NFC] polish colossalai/context/random/__init__.py code style (#3327 )	2 years ago
Andrew	82132f4e3d	[chat] correcting a few obvious typos and grammars errors (#3338 )	2 years ago
YuliangLiu0306	fbd2a9e05b	[hotfix] meta_tensor_compatibility_with_torch2	2 years ago
binmakeswell	15a74da79c	[doc] add Intel cooperation news (#3333 ) * [doc] add Intel cooperation news * [doc] add Intel cooperation news	2 years ago
Michelle	ad285e1656	[NFC] polish colossalai/fx/tracer/_tracer_utils.py (#3323 ) * [NFC] polish colossalai/engine/schedule/_pipeline_schedule.py code style * [NFC] polish colossalai/fx/tracer/_tracer_utils.py code style --------- Co-authored-by: Qianran Ma <qianranm@luchentech.com>	2 years ago
Xu Kai	64350029fe	[NFC] polish colossalai/gemini/paramhooks/_param_hookmgr.py code style	2 years ago
RichardoLuo	1ce9d0c531	[NFC] polish initializer_data.py code style (#3287 )	2 years ago
Ziheng Qin	1bed38ef37	[NFC] polish colossalai/cli/benchmark/models.py code style (#3290 )	2 years ago
Kai Wang (Victor Kai)	964a28678f	[NFC] polish initializer_3d.py code style (#3279 )	2 years ago
Sze-qq	94eec1c5ad	[NFC] polish colossalai/engine/gradient_accumulation/_gradient_accumulation.py code style (#3277 ) Co-authored-by: siqi <siqi@siqis-MacBook-Pro.local>	2 years ago
Arsmart1	8af977f223	[NFC] polish colossalai/context/parallel_context.py code style (#3276 )	2 years ago
Zirui Zhu	1168b50e33	[NFC] polish colossalai/engine/schedule/_pipeline_schedule_v2.py code style (#3275 )	2 years ago
Tong Li	196d4696d0	[NFC] polish colossalai/nn/_ops/addmm.py code style (#3274 )	2 years ago
lucasliunju	4b95464994	[NFC] polish colossalai/amp/__init__.py code style (#3272 )	2 years ago
Xuanlei Zhao	6b3bb2c249	[NFC] polish code style (#3273 )	2 years ago
CZYCW	4cadb25b96	[NFC] policy colossalai/fx/proxy.py code style (#3269 )	2 years ago


2340 Commits (b4788d63ed3bc8e650c878df24f86c8ddaed124e)