ColossalAI

Commit Graph

Author	SHA1	Message	Date
littsk	11f1e426fe	[hotfix] Correct several erroneous code comments (#4794 )	1 year ago
littsk	54b3ad8924	[hotfix] fix norm type error in zero optimizer (#4795 )	1 year ago
Hongxin Liu	da15fdb9ca	[doc] add lazy init docs (#4808 )	1 year ago
Yan haixu	a22706337a	[misc] add last_epoch in CosineAnnealingWarmupLR (#4778 )	1 year ago
Chandler-Bing	b6cf0aca55	[hotfix] change llama2 Colossal-LLaMA-2 script filename (#4800 ) change filename: pretraining.py -> trainin.py there is no file named pretraing.py. wrong writing	1 year ago
Desperado-Jia	62b6af1025	Merge pull request #4805 from TongLi3701/docs/fix [doc] Update TODO in README of Colossal-LLaMA-2	1 year ago
Tong Li	8cbce6184d	update	1 year ago
Hongxin Liu	4965c0dabd	[lazy] support from_pretrained (#4801 ) * [lazy] patch from pretrained * [lazy] fix from pretrained and add tests * [devops] update ci	1 year ago
Tong Li	bd014673b0	update readme	1 year ago
Baizhou Zhang	64a08b2dc3	[checkpointio] support unsharded checkpointIO for hybrid parallel (#4774 ) * support unsharded saving/loading for model * support optimizer unsharded saving * update doc * support unsharded loading for optimizer * small fix	1 year ago
Baizhou Zhang	a2db75546d	[doc] polish shardformer doc (#4779 ) * fix example format in docstring * polish shardformer doc	1 year ago
flybird11111	26cd6d850c	[fix] fix weekly runing example (#4787 ) * [fix] fix weekly runing example * [fix] fix weekly runing example	1 year ago
binmakeswell	d512a4d38d	[doc] add llama2 domain-specific solution news (#4789 ) * [doc] add llama2 domain-specific solution news	1 year ago
Yuanchen	ce777853ae	[feature] ColossalEval: Evaluation Pipeline for LLMs (#4786 ) * Add ColossalEval * Delete evaluate in Chat --------- Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com> Co-authored-by: Tong Li <tong.li352711588@gmail.com>	1 year ago
Tong Li	74aa7d964a	initial commit: add colossal llama 2 (#4784 )	1 year ago
Hongxin Liu	4146f1c0ce	[release] update version (#4775 ) * [release] update version * [doc] revert versions	1 year ago
Jianghai	ce7ade3882	[inference] chatglm2 infer demo (#4724 ) * add chatglm2 * add * gather needed kernels * fix some bugs * finish context forward * finish context stage * fix * add * pause * add * fix bugs * finish chatglm * fix bug * change some logic * fix bugs * change some logics * add * add * add * fix * fix tests * fix	1 year ago
Xu Kai	946ab56c48	[feature] add gptq for inference (#4754 ) * [gptq] add gptq kernel (#4416) * add gptq * refactor code * fix tests * replace auto-gptq * rname inferance/quant * refactor test * add auto-gptq as an option * reset requirements * change assert and check auto-gptq * add import warnings * change test flash attn version * remove example * change requirements of flash_attn * modify tests * [skip ci] change requirements-test * [gptq] faster gptq cuda kernel (#4494) * [skip ci] add cuda kernels * add license * [skip ci] fix max_input_len * format files & change test size * [skip ci] * [gptq] add gptq tensor parallel (#4538) * add gptq tensor parallel * add gptq tp * delete print * add test gptq check * add test auto gptq check * [gptq] combine gptq and kv cache manager (#4706) * combine gptq and kv cache manager * add init bits * delete useless code * add model path * delete usless print and update test * delete usless import * move option gptq to shard config * change replace linear to shardformer * update bloom policy * delete useless code * fix import bug and delete uselss code * change colossalai/gptq to colossalai/quant/gptq * update import linear for tests * delete useless code and mv gptq_kernel to kernel directory * fix triton kernel * add triton import	1 year ago
littsk	1e0e080837	[bug] Fix the version check bug in colossalai run when generating the cmd. (#4713 ) * Fix the version check bug in colossalai run when generating the cmd. * polish code	1 year ago
Hongxin Liu	3e05c07bb8	[lazy] support torch 2.0 (#4763 ) * [lazy] support _like methods and clamp * [lazy] pass transformers models * [lazy] fix device move and requires grad * [lazy] fix requires grad and refactor api * [lazy] fix requires grad	1 year ago
Wenhao Chen	901ab1eedd	[chat]: add lora merge weights config (#4766 ) * feat: modify lora merge weights fn * feat: add lora merge weights config	1 year ago
Baizhou Zhang	493a5efeab	[doc] add shardformer doc to sidebar (#4768 )	1 year ago
Hongxin Liu	66f3926019	[doc] clean up outdated docs (#4765 ) * [doc] clean up outdated docs * [doc] fix linking * [doc] fix linking	1 year ago
Baizhou Zhang	df66741f77	[bug] fix get_default_parser in examples (#4764 )	1 year ago
Baizhou Zhang	c0a033700c	[shardformer] fix master param sync for hybrid plugin/rewrite unwrapping logic (#4758 ) * fix master param sync for hybrid plugin * rewrite unwrap for ddp/fsdp * rewrite unwrap for zero/gemini * rewrite unwrap for hybrid plugin * fix geemini unwrap * fix bugs	1 year ago
Wenhao Chen	7b9b86441f	[chat]: update rm, add wandb and fix bugs (#4471 ) * feat: modify forward fn of critic and reward model * feat: modify calc_action_log_probs * to: add wandb in sft and rm trainer * feat: update train_sft * feat: update train_rm * style: modify type annotation and add warning * feat: pass tokenizer to ppo trainer * to: modify trainer base and maker base * feat: add wandb in ppo trainer * feat: pass tokenizer to generate * test: update generate fn tests * test: update train tests * fix: remove action_mask * feat: remove unused code * fix: fix wrong ignore_index * fix: fix mock tokenizer * chore: update requirements * revert: modify make_experience * fix: fix inference * fix: add padding side * style: modify _on_learn_batch_end * test: use mock tokenizer * fix: use bf16 to avoid overflow * fix: fix workflow * [chat] fix gemini strategy * [chat] fix * sync: update colossalai strategy * fix: fix args and model dtype * fix: fix checkpoint test * fix: fix requirements * fix: fix missing import and wrong arg * fix: temporarily skip gemini test in stage 3 * style: apply pre-commit * fix: temporarily skip gemini test in stage 1&2 --------- Co-authored-by: Mingyan Jiang <1829166702@qq.com>	1 year ago
ppt0011	07c2e3d09c	Merge pull request #4757 from ppt0011/main [doc] explain suitable use case for each plugin	1 year ago
Pengtai Xu	4d7537ba25	[doc] put native colossalai plugins first in description section	1 year ago
Pengtai Xu	e10d9f087e	[doc] add model examples for each plugin	1 year ago
Pengtai Xu	a04337bfc3	[doc] put individual plugin explanation in front	1 year ago
Pengtai Xu	10513f203c	[doc] explain suitable use case for each plugin	1 year ago
Hongxin Liu	079bf3cb26	[misc] update pre-commit and run all files (#4752 ) * [misc] update pre-commit * [misc] run pre-commit * [misc] remove useless configuration files * [misc] ignore cuda for clang-format	1 year ago
github-actions[bot]	3c6b831c26	[format] applied code formatting on changed files in pull request 4743 (#4750 ) Co-authored-by: github-actions <github-actions@github.com>	1 year ago
Hongxin Liu	b5f9e37c70	[legacy] clean up legacy code (#4743 ) * [legacy] remove outdated codes of pipeline (#4692) * [legacy] remove cli of benchmark and update optim (#4690) * [legacy] remove cli of benchmark and update optim * [doc] fix cli doc test * [legacy] fix engine clip grad norm * [legacy] remove outdated colo tensor (#4694) * [legacy] remove outdated colo tensor * [test] fix test import * [legacy] move outdated zero to legacy (#4696) * [legacy] clean up utils (#4700) * [legacy] clean up utils * [example] update examples * [legacy] clean up amp * [legacy] fix amp module * [legacy] clean up gpc (#4742) * [legacy] clean up context * [legacy] clean core, constants and global vars * [legacy] refactor initialize * [example] fix examples ci * [example] fix examples ci * [legacy] fix tests * [example] fix gpt example * [example] fix examples ci * [devops] fix ci installation * [example] fix examples ci	1 year ago
Xuanlei Zhao	32e7f99416	[kernel] update triton init #4740 (#4740 )	1 year ago
Baizhou Zhang	d151dcab74	[doc] explaination of loading large pretrained models (#4741 )	1 year ago
flybird11111	4c4482f3ad	[example] llama2 add fine-tune example (#4673 ) * [shardformer] update shardformer readme [shardformer] update shardformer readme [shardformer] update shardformer readme * [shardformer] update llama2/opt finetune example and shardformer update to llama2 * [shardformer] update llama2/opt finetune example and shardformer update to llama2 * [shardformer] update llama2/opt finetune example and shardformer update to llama2 * [shardformer] change dataset * [shardformer] change dataset * [shardformer] fix CI * [shardformer] fix * [shardformer] fix * [shardformer] fix * [shardformer] fix * [shardformer] fix [example] update opt example [example] resolve comments fix fix * [example] llama2 add finetune example * [example] llama2 add finetune example * [example] llama2 add finetune example * [example] llama2 add finetune example * fix * update llama2 example * update llama2 example * fix * update llama2 example * update llama2 example * update llama2 example * update llama2 example * update llama2 example * update llama2 example * Update requirements.txt * update llama2 example * update llama2 example * update llama2 example	1 year ago
Xuanlei Zhao	ac2797996b	[shardformer] add custom policy in hybrid parallel plugin (#4718 ) * add custom policy * update assert	1 year ago
Baizhou Zhang	451c3465fb	[doc] polish shardformer doc (#4735 ) * arrange position of chapters * fix typos in seq parallel doc	1 year ago
ppt0011	73eb3e8862	Merge pull request #4738 from ppt0011/main [legacy] remove deterministic data loader test	1 year ago
Bin Jia	608cffaed3	[example] add gpt2 HybridParallelPlugin example (#4653 ) * add gpt2 HybridParallelPlugin example * update readme and testci * update test ci * fix test_ci bug * update requirements * add requirements * update requirements * add requirement * rename file	1 year ago
Bin Jia	6a03c933a0	[shardformer] update seq parallel document (#4730 ) * update doc of seq parallel * fix typo	1 year ago
Pengtai Xu	cd4e61d149	[legacy] remove deterministic data loader test	1 year ago
flybird11111	46162632e5	[shardformer] update pipeline parallel document (#4725 ) * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document	1 year ago
digger yu	e4fc57c3de	Optimized some syntax errors in the documentation and code under applications/ (#4127 ) Co-authored-by: flybird11111 <1829166702@qq.com>	1 year ago
Baizhou Zhang	50e5602c2d	[doc] add shardformer support matrix/update tensor parallel documents (#4728 ) * add compatibility matrix for shardformer doc * update tp doc	1 year ago
github-actions[bot]	8c2dda7410	[format] applied code formatting on changed files in pull request 4726 (#4727 ) Co-authored-by: github-actions <github-actions@github.com>	1 year ago
Baizhou Zhang	f911d5b09d	[doc] Add user document for Shardformer (#4702 ) * create shardformer doc files * add docstring for seq-parallel * update ShardConfig docstring * add links to llama example * add outdated massage * finish introduction & supporting information * finish 'how shardformer works' * finish shardformer.md English doc * fix doctest fail * add Chinese document	1 year ago
binmakeswell	ce97790ed7	[doc] fix llama2 code link (#4726 ) * [doc] fix llama2 code link * [doc] fix llama2 code link * [doc] fix llama2 code link	1 year ago
flybird11111	20190b49a5	[shardformer] to fix whisper test failed due to significant accuracy differences. (#4710 ) * [shardformer] fix whisper test failed * [shardformer] fix whisper test failed * [shardformer] fix whisper test failed * [shardformer] fix whisper test failed	1 year ago


2850 Commits (8b1b237a5f7e3f7879adddd59a16d0daa5657b49)
