ColossalAI

Commit Graph

Author	SHA1	Message	Date
csric	e355144375	[chatgpt] Detached PPO Training (#3195 ) * run the base * working on dist ppo * sync * detached trainer * update detached trainer. no maker update function * facing init problem * 1 maker 1 trainer detached run. but no model update * facing cuda problem * fix save functions * verified maker update * nothing * add ignore * analyize loss issue * remove some debug codes * facing 2m1t stuck issue * 2m1t verified * do not use torchrun * working on 2m2t * working on 2m2t * initialize strategy in ray actor env * facing actor's init order issue * facing ddp model update issue (need unwarp ddp) * unwrap ddp actor * checking 1m2t stuck problem * nothing * set timeout for trainer choosing. It solves the stuck problem! * delete some debug output * rename to sync with upstream * rename to sync with upstream * coati rename * nothing * I am going to detach the replaybuffer from trainer and make it a Ray Actor. Two benefits: 1. support TP trainer. 2. asynchronized buffer operations * experience_maker_holder performs target-revolving _send_experience() instead of length comparison. * move code to ray subfolder * working on pipeline inference * apply comments --------- Co-authored-by: csric <richcsr256@gmail.com>	2023-04-17 14:46:50 +08:00
MisterLin1995	1a809eddaa	[chat] ChatGPT train prompts on ray example (#3309 ) * [feat][chatgpt]train prompts on ray example * [fix]simplify code * [fix]remove depreciated parameter * [fix]add dependencies * [fix]method calling * [fix]experience maker * [fix]missing loss function * [fix]init optimizer * [feat]add usage comment * [fix]rename files * [fix]add readme * [fix]file path * [fix]move directory --------- Co-authored-by: jiangwen <zxl265370@antgroup.com>	2023-04-13 18:18:36 +08:00
binmakeswell	535b896435	[chat] polish tutorial doc (#3551 ) * [chat] clean up duplicate tutorial * [chat] clean up duplicate tutorial * [chat] clean up duplicate tutorial * [chat] clean up duplicate tutorial	2023-04-13 18:11:48 +08:00
Yuanchen	7182ac2a04	[chat]add examples of training with limited resources in chat readme (#3536 ) Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>	2023-04-12 15:47:09 +08:00
zhang-yi-chi	e6a132a449	[chat]: add vf_coef argument for PPOTrainer (#3318 )	2023-04-11 09:54:59 +08:00
ver217	89fd10a1c9	[chat] add zero2 cpu strategy for sft training (#3520 )	2023-04-10 19:00:13 +08:00
binmakeswell	990d4c3e4e	[doc] hide diffusion in application path (#3519 ) - [ ] Stable Diffusion - [ ] Dreambooth It's easy for users to think that we don't support them yet. Add them after migrating them from example to application https://github.com/hpcaitech/ColossalAI/tree/main/examples/images	2023-04-10 17:52:24 +08:00
binmakeswell	0c0455700f	[doc] add requirement and highlight application (#3516 ) * [doc] add requirement and highlight application * [doc] link example and application	2023-04-10 17:37:16 +08:00
NatalieC323	635d0a1baf	[Chat Community] Update README.md (fixed#3487) (#3506 ) * Update README.md * Update README.md * Update README.md * Update README.md --------- Co-authored-by: Fazzie-Maqianli <55798671+Fazziekey@users.noreply.github.com>	2023-04-10 14:36:39 +08:00
gongenlei	a7ca297281	[coati] Fix LlamaCritic (#3475 ) * mv LlamaForCausalLM to LlamaModel * rm unused imports --------- Co-authored-by: gongenlei <gongenlei@baidu.com>	2023-04-07 11:39:09 +08:00
binmakeswell	891b8e7fac	[chat] fix stage3 PPO sample sh command (#3477 )	2023-04-06 18:08:16 +08:00
Fazzie-Maqianli	6afeb1202a	add community example dictionary (#3465 )	2023-04-06 15:04:48 +08:00
Frank Lee	80eba05b0a	[test] refactor tests with spawn (#3452 ) * [test] added spawn decorator * polish code * polish code * polish code * polish code * polish code * polish code	2023-04-06 14:51:35 +08:00
YY Lin	62f4e2eb07	[Chat]Add Peft support & fix the ptx bug (#3433 ) * Update ppo.py Fix the bug of fetching wrong batch data * Add peft model support in SFT and Prompts training In stage-1 and stage-3, the peft model supports are added. So the trained artifacts will be only a small lora additions instead of the whole bunch of files. * Delete test_prompts.txt * Delete test_pretrained.txt * Move the peft stuffs to a community folder. * Move the demo sft to community * delete dirty files * Add instructions to install peft using source * Remove Chinese comments * remove the Chinese comments	2023-04-06 11:54:52 +08:00
Dr-Corgi	73afb63594	[chat]fix save_model(#3377 ) The function save_model should be a part of PPOTrainer.	2023-04-06 11:19:14 +08:00
kingkingofall	57a3c4db6d	[chat]fix readme (#3429 ) * fix stage 2 fix stage 2 * add torch	2023-04-06 10:58:53 +08:00
Camille Zhong	72cb4dd433	[Chat] fix the tokenizer "int too big to convert" error in SFT training (#3453 ) * Add RoBERTa for RLHF Stage 2 & 3 (test) RoBERTa for RLHF Stage 2 & 3 (still in testing) * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)" This reverts commit `06741d894d`. * Add RoBERTa for RLHF stage 2 & 3 1. add roberta folder under model folder 2. add roberta option in train_reward_model.py 3. add some test in testci * Update test_ci.sh * Revert "Update test_ci.sh" This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a. * Add RoBERTa for RLHF Stage 2 & 3 (test) RoBERTa for RLHF Stage 2 & 3 (still in testing) * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)" This reverts commit `06741d894d`. * Add RoBERTa for RLHF stage 2 & 3 1. add roberta folder under model folder 2. add roberta option in train_reward_model.py 3. add some test in testci * Update test_ci.sh * Revert "Update test_ci.sh" This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a. * update roberta with coati * chat ci update * Revert "chat ci update" This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846. * [Chat] fix the tokenizer "int too big to convert" error in SFT training fix the tokenizer error during SFT training using Bloom and OPT	2023-04-06 09:30:28 +08:00
Yuanchen	b92313903f	fix save_model indent error in ppo trainer (#3450 ) Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>	2023-04-05 09:45:42 +08:00
Yuanchen	773955abfa	fix save_model inin naive and ddp strategy (#3436 ) Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>	2023-04-04 15:30:01 +08:00
ver217	26b7aac0be	[zero] reorganize zero/gemini folder structure (#3424 ) * [zero] refactor low-level zero folder structure * [zero] fix legacy zero import path * [zero] fix legacy zero import path * [zero] remove useless import * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor legacy zero import path * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor legacy zero import path * [zero] fix test import path * [zero] fix test * [zero] fix circular import * [zero] update import	2023-04-04 13:48:16 +08:00
Yuanchen	b09adff724	[chat]fix sft training for bloom, gpt and opt (#3418 ) fix sft training for bloom, gpt and opt	2023-04-04 09:46:23 +08:00
Camille Zhong	30412866e0	[chatgpt] add pre-trained model RoBERTa for RLHF stage 2 & 3 (#3223 ) * Add RoBERTa for RLHF Stage 2 & 3 (test) RoBERTa for RLHF Stage 2 & 3 (still in testing) * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)" This reverts commit `06741d894d`. * Add RoBERTa for RLHF stage 2 & 3 1. add roberta folder under model folder 2. add roberta option in train_reward_model.py 3. add some test in testci * add test for reward model training * Update test_ci.sh * Revert "Update test_ci.sh" This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a. * Add RoBERTa for RLHF Stage 2 & 3 (test) RoBERTa for RLHF Stage 2 & 3 (still in testing) * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)" This reverts commit `06741d894d`. * Add RoBERTa for RLHF stage 2 & 3 1. add roberta folder under model folder 2. add roberta option in train_reward_model.py 3. add some test in testci * Update test_ci.sh * Revert "Update test_ci.sh" This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a. * update roberta with coati	2023-04-03 10:11:03 +08:00
Andrew	82132f4e3d	[chat] correcting a few obvious typos and grammars errors (#3338 )	2023-03-30 14:18:37 +08:00
Fazzie-Maqianli	0fbadce79c	[doc] added authors to the chat application (#3307 )	2023-03-29 11:04:30 +08:00
BlueRum	b512893637	Polish readme link (#3306 )	2023-03-29 10:25:50 +08:00
github-actions[bot]	cb413ccf28	[format] applied code formatting on changed files in pull request 3300 (#3302 ) Co-authored-by: github-actions <github-actions@github.com>	2023-03-29 09:28:24 +08:00
binmakeswell	31c78f2be3	[doc] add ColossalChat news (#3304 ) * [doc] add ColossalChat news * [doc] add ColossalChat news	2023-03-29 09:27:55 +08:00
Frank Lee	e235a24673	[application] updated the README (#3301 ) * [application] updated the README * polish code	2023-03-29 08:47:00 +08:00
BlueRum	8257e1055d	[chat]polish prompts training (#3300 ) * polish train_prompts * polish readme	2023-03-29 08:44:16 +08:00
ver217	62f7156131	[coati] fix inference profanity check (#3299 )	2023-03-29 04:26:35 +08:00
github-actions[bot]	5134ad5d1a	[format] applied code formatting on changed files in pull request 3296 (#3298 ) Co-authored-by: github-actions <github-actions@github.com>	2023-03-29 02:35:40 +08:00
BlueRum	c8b723d6c2	[chat]Update Readme (#3296 ) * Update README.md * Update README.md * Update README.md * update example readme	2023-03-29 02:32:17 +08:00
ver217	73b542a124	[coati] inference supports profanity check (#3295 )	2023-03-29 02:14:35 +08:00
ver217	ce2cafae76	[coati] add repetition_penalty for inference (#3294 )	2023-03-29 01:18:45 +08:00
Fazzie-Maqianli	a88ed0f83a	add limit (#3293 )	2023-03-29 00:53:23 +08:00
Fazzie-Maqianli	c5484281aa	[ColossalChat]add cite for datasets (#3292 )	2023-03-29 00:38:36 +08:00
Fazzie-Maqianli	ec7af22a43	fix image (#3288 )	2023-03-28 23:34:21 +08:00
Fazzie-Maqianli	1f7d9afbf8	add example (#3286 )	2023-03-28 23:07:15 +08:00
ver217	4905b21b94	[coati] fix inference output (#3285 ) * [coati] fix inference requirements * [coati] add output postprocess * [coati] update inference readme * [coati] fix inference requirements	2023-03-28 21:20:28 +08:00
Fazzie-Maqianli	bb6196e71a	remove chatgpt (#3284 )	2023-03-28 20:29:09 +08:00
Fazzie-Maqianli	b0ce5a1032	[Coati] first commit (#3283 )	2023-03-28 20:25:36 +08:00
binmakeswell	d32ef94ad9	[doc] fix typo (#3222 ) * [doc] fix typo * [doc] fix typo	2023-03-24 13:33:35 +08:00
ver217	78fd31f9c1	[chatgpt] add precision option for colossalai (#3233 )	2023-03-24 12:15:06 +08:00
Fazzie-Maqianli	bd39877da4	support instrcut training (#3230 )	2023-03-24 11:45:01 +08:00
Camille Zhong	9bc702ab48	[doc] update chatgpt doc paper link (#3229 ) #issue 3189	2023-03-24 11:21:39 +08:00
Fazzie-Maqianli	bbac6760e5	fix torch version (#3225 )	2023-03-23 20:56:35 +08:00
Fazzie-Maqianli	fa97a9cab4	[chatgpt] unnify datasets (#3218 )	2023-03-23 17:38:30 +08:00
Fazzie-Maqianli	4fd4bd9d9a	[chatgpt] support instuct training (#3216 )	2023-03-23 16:46:20 +08:00
Yuanchen	9998d5ef64	[chatgpt]add reward model code for deberta (#3199 ) Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>	2023-03-22 19:09:39 +08:00
Fazzie-Maqianli	1e1b9d2fea	[chatgpt]support llama (#3070 )	2023-03-22 15:44:31 +08:00
pgzhang	b429529365	[chatgpt] add supervised learning fine-tune code (#3183 ) * [chatgpt] add supervised fine-tune code * [chatgpt] delete unused code and modified comment code * [chatgpt] use pytorch distributed sampler instead --------- Co-authored-by: zhangpengpeng <zhangpengpeng@joyy.com>	2023-03-22 09:59:42 +08:00
BlueRum	7548ca5a54	[chatgpt]Reward Model Training Process update (#3133 ) * add normalize function to value_head in bloom rm * add normalization to value_function in gpt_rm * add normalization to value_head of opt_rm * add Anthropic/hh-rlhf dataset * Update __init__.py * Add LogExpLoss in RM training * Update __init__.py * update rm trainer to use acc as target * update example/train_rm * Update train_rm.sh * code style * Update README.md * Update README.md * add rm test to ci * fix tokenier * fix typo * change batchsize to avoid oom in ci * Update test_ci.sh	2023-03-20 09:59:06 +08:00
ver217	1e58d31bb7	[chatgpt] fix trainer generate kwargs (#3166 )	2023-03-17 17:31:22 +08:00
ver217	c474fda282	[chatgpt] fix ppo training hanging problem with gemini (#3162 ) * [chatgpt] fix generation early stopping * [chatgpt] fix train prompts example	2023-03-17 15:41:47 +08:00
binmakeswell	3c01280a56	[doc] add community contribution guide (#3153 ) * [doc] update contribution guide * [doc] update contribution guide * [doc] add community contribution guide	2023-03-17 11:07:24 +08:00
BlueRum	23cd5e2ccf	[chatgpt]update ci (#3087 ) * [chatgpt]update ci * Update test_ci.sh * Update test_ci.sh * Update test_ci.sh * test * Update train_prompts.py * Update train_dummy.py * add save_path * polish * add save path * polish * add save path * polish * delete bloom-560m test delete bloom-560m test because of oom * add ddp test	2023-03-14 11:01:17 +08:00
BlueRum	68577fbc43	[chatgpt]Fix examples (#3116 ) * fix train_dummy * fix train-prompts	2023-03-13 11:12:22 +08:00
BlueRum	0672b5afac	[chatgpt] fix lora support for gpt (#3113 ) * fix gpt-actor * fix gpt-critic * fix opt-critic	2023-03-13 10:37:41 +08:00
hiko2MSP	191daf7411	[chatgpt] type miss of kwargs (#3107 )	2023-03-13 00:00:02 +08:00
BlueRum	c9dd036592	[chatgpt] fix lora save bug (#3099 ) * fix colo-stratergy * polish * fix lora * fix ddp * polish * polish	2023-03-10 17:58:10 +08:00
Fazzie-Maqianli	02ae80bf9c	[chatgpt]add flag of action mask in critic(#3086 )	2023-03-10 14:40:14 +08:00
wenjunyang	b51bfec357	[chatgpt] change critic input as state (#3042 ) * fix Critic * fix Critic * fix Critic * fix neglect of attention mask * fix neglect of attention mask * fix neglect of attention mask * add return --------- Co-authored-by: yangwenjun <yangwenjun@soyoung.com> Co-authored-by: yangwjd <yangwjd@chanjet.com>	2023-03-08 15:18:02 +08:00
Fazzie-Maqianli	c21b11edce	change nn to models (#3032 )	2023-03-07 16:34:22 +08:00
github-actions[bot]	e86d9bb2e1	[format] applied code formatting on changed files in pull request 3025 (#3026 ) Co-authored-by: github-actions <github-actions@github.com>	2023-03-07 12:55:17 +08:00
BlueRum	55dcd3051a	[chatgpt] fix readme (#3025 )	2023-03-07 10:21:25 +08:00
LuGY	287d60499e	[chatgpt] Add saving ckpt callback for PPO (#2880 ) * add checkpoint callback for chatgpt * add save ckpt callbacks for ppo --------- Co-authored-by: Fazzie-Maqianli <55798671+Fazziekey@users.noreply.github.com>	2023-03-07 10:13:25 +08:00
BlueRum	e588703454	[chatgpt]fix inference model load (#2988 ) * fix lora bug * polish * fix lora gemini * fix inference laod model bug	2023-03-07 09:17:52 +08:00
ver217	0ff8406b00	[chatgpt] allow shard init and display warning (#2986 )	2023-03-03 16:27:59 +08:00
BlueRum	f5ca0397dd	[chatgpt] fix lora gemini conflict in RM training (#2984 ) * fix lora bug * polish * fix lora gemini	2023-03-03 15:58:16 +08:00
ver217	19ad49fb3b	[chatgpt] making experience support dp (#2971 ) * [chatgpt] making experience support dp * [chatgpt] update example test ci * [chatgpt] update example test ci * [chatgpt] update example test ci * [chatgpt] update example test ci * [chatgpt] update sampler * [chatgpt] update example test ci * [chatgpt] refactor sampler * [chatgpt] update example test ci	2023-03-03 15:51:19 +08:00
BlueRum	c9e27f0d1b	[chatgpt]fix lora bug (#2974 ) * fix lora bug * polish	2023-03-02 17:51:44 +08:00
BlueRum	82149e9d1b	[chatgpt] fix inference demo loading bug (#2969 ) * [chatgpt] fix inference demo loading bug * polish	2023-03-02 16:18:33 +08:00
Fazzie-Maqianli	bbf9c827c3	[ChatGPT] fix README (#2966 ) * Update README.md * fix README * Update README.md * Update README.md --------- Co-authored-by: fastalgo <youyang@cs.berkeley.edu> Co-authored-by: BlueRum <70618399+ht-zhou@users.noreply.github.com>	2023-03-02 15:00:05 +08:00
binmakeswell	b0a8766381	[doc] fix chatgpt inference typo (#2964 )	2023-03-02 11:22:08 +08:00
BlueRum	489a9566af	[chatgpt]add inference example (#2944 ) * [chatgpt] support inference example * Create inference.sh * Update README.md * Delete inference.sh * Update inference.py	2023-03-01 13:39:39 +08:00
binmakeswell	8264cd7ef1	[doc] add env scope (#2933 )	2023-02-28 15:39:51 +08:00
BlueRum	2e16f842a9	[chatgpt]support opt & gpt for rm training (#2876 )	2023-02-22 16:58:11 +08:00
BlueRum	34ca324b0d	[chatgpt] Support saving ckpt in examples (#2846 ) * [chatgpt]fix train_rm bug with lora * [chatgpt]support colossalai strategy to train rm * fix pre-commit * fix pre-commit 2 * [chatgpt]fix rm eval typo * fix rm eval * fix pre commit * add support of saving ckpt in examples * fix single-gpu save	2023-02-22 10:00:26 +08:00
BlueRum	3eebc4dff7	[chatgpt] fix rm eval (#2829 ) * [chatgpt]fix train_rm bug with lora * [chatgpt]support colossalai strategy to train rm * fix pre-commit * fix pre-commit 2 * [chatgpt]fix rm eval typo * fix rm eval * fix pre commit	2023-02-21 11:35:45 +08:00
ver217	b6a108cb91	[chatgpt] add test checkpoint (#2797 ) * [chatgpt] add test checkpoint * [chatgpt] test checkpoint use smaller model	2023-02-20 15:22:36 +08:00
ver217	a619a190df	[chatgpt] update readme about checkpoint (#2792 ) * [chatgpt] add save/load checkpoint sample code * [chatgpt] add save/load checkpoint readme * [chatgpt] refactor save/load checkpoint readme	2023-02-17 12:43:31 +08:00
ver217	4ee311c026	[chatgpt] startegy add prepare method (#2766 ) * [chatgpt] startegy add prepare method * [chatgpt] refactor examples * [chatgpt] refactor strategy.prepare * [chatgpt] support save/load checkpoint * [chatgpt] fix unwrap actor * [chatgpt] fix unwrap actor	2023-02-17 11:27:27 +08:00
ver217	a88bc828d5	[chatgpt] disable shard init for colossalai (#2767 )	2023-02-16 20:09:34 +08:00
BlueRum	613efebc5c	[chatgpt] support colossalai strategy to train rm (#2742 ) * [chatgpt]fix train_rm bug with lora * [chatgpt]support colossalai strategy to train rm * fix pre-commit * fix pre-commit 2	2023-02-16 11:24:07 +08:00
BlueRum	648183a960	[chatgpt]fix train_rm bug with lora (#2741 )	2023-02-16 10:25:17 +08:00
CH.Li	7aacfad8af	fix typo (#2721 )	2023-02-15 14:54:53 +08:00
ver217	9c0943ecdb	[chatgpt] optimize generation kwargs (#2717 ) * [chatgpt] ppo trainer use default generate args * [chatgpt] example remove generation preparing fn * [chatgpt] benchmark remove generation preparing fn * [chatgpt] fix ci	2023-02-15 13:59:58 +08:00
binmakeswell	d4d3387f45	[doc] add open-source contribution invitation (#2714 ) * [doc] fix typo * [doc] add invitation	2023-02-15 11:08:35 +08:00
binmakeswell	94f000515b	[doc] add Quick Preview (#2706 )	2023-02-14 23:07:30 +08:00
binmakeswell	8408c852a6	[app] fix ChatGPT requirements (#2704 )	2023-02-14 22:48:15 +08:00
ver217	1b34701027	[app] add chatgpt application (#2698 )	2023-02-14 22:17:25 +08:00

141 Commits (2c8ae37f61f123a305f7fe66af29140fe0f68a34)