Commit Graph

3073 Commits (868afdb31191ef7b3fa48d6fa71e7758c8707786)

Wang Binluo 868afdb311
Dev/zero offload (#5858)
* fix llama

* fix llama
2024-06-26 16:07:06 +08:00
Wang Binluo de3f67d128
fix llama (#5856) 2024-06-26 10:15:13 +08:00
Wang Binluo 4c06215dce
Merge pull request #5844 from wangbluo/offload
Update Qwen2 model
2024-06-20 17:07:57 +08:00
Wang Binluo e893f88a4f
Merge branch 'dev/zero-offload' into offload 2024-06-20 17:07:24 +08:00
wangbluo d4ff644ef3 update qwen model 2024-06-20 09:04:57 +00:00
wangbluo dba59354d7 remove vocab_size args 2024-06-20 08:06:39 +00:00
Wang Binluo 35ef72bfd1
Merge pull request #5842 from wangbluo/dev/zero-offload
update llama model
2024-06-20 15:37:03 +08:00
pre-commit-ci[bot] 351a1c269b [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2024-06-20 06:50:40 +00:00
wangbluo b12e9a3275 update llama model 2024-06-20 06:46:25 +00:00
wangbluo 52ea64824e remove 4d attention mask 2024-06-19 09:28:08 +00:00
pre-commit-ci[bot] df612434c9 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2024-06-14 16:27:46 +08:00
Wang Binluo 4c69e2dc91 support qwen model 2024-06-14 16:27:46 +08:00
Wenhao Chen 32e642bf40 revert: enable return_outputs when necessary 2024-06-14 16:27:46 +08:00
Wenhao Chen 856b39f69d to: add qwen2 auto policy 2024-06-14 16:27:46 +08:00
Wenhao Chen 6fa181ebef feat: add qwen2 to model_zoo 2024-06-14 16:27:46 +08:00
Wenhao Chen 14305c9449 test: add qwen2 shard test 2024-06-14 16:27:46 +08:00
Wenhao Chen 5512bdf1fc fix: modify model config and add Qwen2RMSNorm 2024-06-14 16:27:46 +08:00
Wenhao Chen 5c2a47a667 feat: support qwen2 model 2024-06-14 16:27:46 +08:00
Wenhao Chen 61545fcfee feat: add `sub_dp_size` in plugin 2024-04-01 16:02:12 +08:00
Wenhao Chen 6ceaf4f1f8 tests: add `sub_dp_group` test 2024-04-01 16:02:12 +08:00
Wenhao Chen 9291f07964 feat: add `sub_dp_group` 2024-04-01 16:02:12 +08:00
Wenhao Chen 1aaa453706 perf: use async copy to accelerate memcpy 2024-04-01 16:02:12 +08:00
Wenhao Chen a53c8c1ade to: remove MoE temporarily 2024-04-01 16:02:12 +08:00
Wenhao Chen 93aaa21d4a feat: add `DataPrefetcher` 2024-04-01 16:02:12 +08:00
Wenhao Chen a1ab2d374e misc: add offload warning 2024-04-01 16:02:12 +08:00
Wenhao Chen e614aa34f3
[shardformer, pipeline] add `gradient_checkpointing_ratio` and heterogeneous shard policy for llama (#5508)
* feat: add `GradientCheckpointConfig` and `PipelineGradientCheckpointConfig`

* feat: apply `GradientCheckpointConfig` to policy and llama_forward

* feat: move `distribute_layer` and `get_stage_index` to PipelineStageManager

* fix: add optional args for `distribute_layer` and `get_stage_index`

* fix: fix changed API calls

* test: update llama tests

* style: polish `GradientCheckpointConfig`

* fix: fix pipeline utils tests
2024-04-01 11:34:58 +08:00
YeAnbang df5e9c53cf
[ColossalChat] Update RLHF V2 (#5286)
* Add dpo. Fix sft, ppo, lora. Refactor all

* fix and tested ppo

* 2nd round refactor

* add ci tests

* fix ci

* fix ci

* fix readme, style

* fix readme style

* fix style, fix benchmark

* reproduce benchmark result, remove useless files

* rename to ColossalChat

* use new image

* fix ci workflow

* fix ci

* use local model/tokenizer for ci tests

* fix ci

* fix ci

* fix ci

* fix ci timeout

* fix rm progress bar. fix ci timeout

* fix ci

* fix ci typo

* remove 3d plugin from ci temporary

* test environment

* cannot save optimizer

* support chat template

* fix readme

* fix path

* test ci locally

* restore build_or_pr

* fix ci data path

* fix benchmark

* fix ci, move ci tests to 3080, disable fast tokenizer

* move ci to 85

* support flash attention 2

* add all-in-one data preparation script. Fix colossal-llama2-chat chat template

* add hardware requirements

* move ci test data

* fix save_model, add unwrap

* fix missing bos

* fix missing bos; support grad accumulation with gemini

* fix ci

* fix ci

* fix ci

* fix llama2 chat template config

* debug sft

* debug sft

* fix colossalai version requirement

* fix ci

* add sanity check to prevent NaN loss

* fix requirements

* add dummy data generation script

* add dummy data generation script

* add dummy data generation script

* add dummy data generation script

* update readme

* update readme

* update readme and ignore

* fix logger bug

* support parallel_output

* modify data preparation logic

* fix tokenization

* update lr

* fix inference

* run pre-commit

---------

Co-authored-by: Tong Li <tong.li352711588@gmail.com>
2024-03-29 14:12:29 +08:00
Yuanheng Zhao 36c4bb2893
[Fix] Grok-1 use tokenizer from the same pretrained path (#5532)
* [fix] use tokenizer from the same pretrained path

* trust remote code
2024-03-28 16:30:04 +08:00
Insu Jang 00525f7772
[shardformer] fix pipeline forward error if custom layer distribution is used (#5189)
* Use self.[distribute_layers|get_stage_index] to exploit custom layer distribution

* Change static methods for t5 layer distribution to member functions

* Change static methods for whisper layer distribution to member functions

* Replace whisper policy usage with self one

* Fix test case to use non-static layer distribution methods

* fix: fix typo

---------

Co-authored-by: Wenhao Chen <cwher@outlook.com>
2024-03-27 13:57:00 +08:00
github-actions[bot] e6707a6e8d
[format] applied code formatting on changed files in pull request 5510 (#5517)
Co-authored-by: github-actions <github-actions@github.com>
2024-03-27 11:21:03 +08:00
Hongxin Liu 19e1a5cf16
[shardformer] update colo attention to support custom mask (#5510)
* [feature] refactor colo attention (#5462)

* [extension] update api

* [feature] add colo attention

* [feature] update sdpa

* [feature] update npu attention

* [feature] update flash-attn

* [test] add flash attn test

* [test] update flash attn test

* [shardformer] update modeling to fit colo attention (#5465)

* [misc] refactor folder structure

* [shardformer] update llama flash-attn

* [shardformer] fix llama policy

* [devops] update tensornvme install

* [test] update llama test

* [shardformer] update colo attn kernel dispatch

* [shardformer] update blip2

* [shardformer] update chatglm

* [shardformer] update gpt2

* [shardformer] update gptj

* [shardformer] update opt

* [shardformer] update vit

* [shardformer] update colo attention mask prep

* [shardformer] update whisper

* [test] fix shardformer tests (#5514)

* [test] fix shardformer tests

* [test] fix shardformer tests
2024-03-27 11:19:32 +08:00
Edenzzzz 9a3321e9f4
Merge pull request #5515 from Edenzzzz/fix_layout_convert
Fix layout convertor caching
2024-03-26 19:51:02 +08:00
Edenzzzz 18edcd5368 Empty-Commit 2024-03-26 19:50:41 +08:00
Edenzzzz 61da3fbc52 fixed layout converter caching and updated tester 2024-03-26 17:22:27 +08:00
Rocky Duan cbe34c557c
Fix ColoTensorSpec for py11 (#5440) 2024-03-26 15:56:49 +08:00
Hongxin Liu a7790a92e8
[devops] fix example test ci (#5504) 2024-03-26 15:09:05 +08:00
Yuanheng Zhao 131f32a076
[fix] fix grok-1 example typo (#5506) 2024-03-26 10:19:42 +08:00
flybird11111 0688d92e2d
[shardformer]Fix lm parallel. (#5480)
* fix

* padding vocab_size when using pipeline parallelism

padding vocab_size when using pipeline parallelism

fix

fix

* fix

* fix

fix

fix

* fix gather output

* fix

* fix

* fix

fix resize embedding

fix resize embedding

* fix resize embedding

fix

* revert

* revert

* revert

* fix lm forward distribution

* fix

* test ci

* fix
2024-03-25 17:21:51 +08:00
binmakeswell 34e909256c
[release] grok-1 inference benchmark (#5500)
* [release] grok-1 inference benchmark

* [release] grok-1 inference benchmark

* [release] grok-1 inference benchmark

* [release] grok-1 inference benchmark

* [release] grok-1 inference benchmark
2024-03-25 14:42:51 +08:00
Wenhao Chen bb0a668fee
[hotfix] set return_outputs=False in examples and polish code (#5404)
* fix: simplify merge_batch

* fix: use return_outputs=False to eliminate extra memory consumption

* feat: add return_outputs warning

* style: remove `return_outputs=False` as it is the default value
2024-03-25 12:31:09 +08:00
Yuanheng Zhao 5fcd7795cd
[example] update Grok-1 inference (#5495)
* revise grok-1 example

* remove unused arg in scripts

* prevent re-installing torch

* update readme

* revert modifying colossalai requirements

* add perf

* trivial

* add tokenizer url
2024-03-24 20:24:11 +08:00
binmakeswell 6df844b8c4
[release] grok-1 314b inference (#5490)
* [release] grok-1 inference

* [release] grok-1 inference

* [release] grok-1 inference
2024-03-22 15:48:12 +08:00
Hongxin Liu 848a574c26
[example] add grok-1 inference (#5485)
* [misc] add submodule

* remove submodule

* [example] support grok-1 tp inference

* [example] add grok-1 inference script

* [example] refactor code

* [example] add grok-1 readme

* [example] add test ci

* [example] update readme
2024-03-21 18:07:22 +08:00
binmakeswell d158fc0e64
[doc] update open-sora demo (#5479)
* [doc] update open-sora demo

* [doc] update open-sora demo

* [doc] update open-sora demo
2024-03-20 16:08:41 +08:00
binmakeswell bd998ced03
[doc] release Open-Sora 1.0 with model weights (#5468)
* [doc] release Open-Sora 1.0 with model weights

* [doc] release Open-Sora 1.0 with model weights

* [doc] release Open-Sora 1.0 with model weights
2024-03-18 18:31:18 +08:00
flybird11111 5e16bf7980
[shardformer] fix gathering output when using tensor parallelism (#5431)
* fix

* padding vocab_size when using pipeline parallelism

padding vocab_size when using pipeline parallelism

fix

fix

* fix

* fix

fix

fix

* fix gather output

* fix

* fix

* fix

fix resize embedding

fix resize embedding

* fix resize embedding

fix

* revert

* revert

* revert
2024-03-18 15:55:11 +08:00
Hongxin Liu f2e8b9ef9f
[devops] fix compatibility (#5444)
* [devops] fix compatibility

* [hotfix] update compatibility test on pr

* [devops] fix compatibility

* [devops] record duration during comp test

* [test] decrease test duration

* fix falcon
2024-03-13 15:24:13 +08:00
digger yu 385e85afd4
[hotfix] fix typo s/keywrods/keywords etc. (#5429) 2024-03-12 11:25:16 +08:00
Camille Zhong da885ed540
fix tensor data update for gemini loss calculation (#5442) 2024-03-11 13:49:58 +08:00
Hongxin Liu 8020f42630
[release] update version (#5411) 2024-03-07 23:36:07 +08:00