Commit Graph

2840 Commits (21ba89cab635e62815edb9e00d4579a435ac75e1)

Baizhou Zhang 21ba89cab6
[gemini] support gradient accumulation (#4869)
* add test

* fix no_sync bug in low level zero plugin

* fix test

* add argument for grad accum

* add grad accum in backward hook for gemini

* finish implementation, rewrite tests

* fix test

* skip stuck model in low level zero test

* update doc

* optimize communication & fix gradient checkpoint

* modify doc

* clean up code

* update cpu adam fp16 case
2023-10-17 14:07:21 +08:00
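
A minimal sketch of how the gradient accumulation described above might be driven, assuming the Gemini plugin exposes an `enable_gradient_accumulation` flag (inferred from the "add argument for grad accum" bullet; verify against the actual plugin signature):

```python
import torch
import torch.nn as nn
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin

colossalai.launch_from_torch(config={})  # run under torchrun

model = nn.Linear(32, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

plugin = GeminiPlugin(enable_gradient_accumulation=True)  # assumed flag name
booster = Booster(plugin=plugin)
model, optimizer, criterion, _, _ = booster.boost(model, optimizer, criterion)

accum_steps = 4
for step in range(16):
    x, y = torch.randn(8, 32).cuda(), torch.randint(0, 2, (8,)).cuda()
    loss = criterion(model(x), y) / accum_steps
    booster.backward(loss, optimizer)  # backward hook accumulates grads
    if (step + 1) % accum_steps == 0:  # update once per accumulation window
        optimizer.step()
        optimizer.zero_grad()
```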
github-actions[bot] a41cf88e9b
[format] applied code formatting on changed files in pull request 4908 (#4918)
Co-authored-by: github-actions <github-actions@github.com>
2023-10-17 10:48:24 +08:00
Hongxin Liu 4f68b3f10c
[kernel] support pure fp16 for cpu adam and update gemini optim tests (#4921)
* [kernel] support pure fp16 for cpu adam (#4896)

* [kernel] fix cpu adam kernel for pure fp16 and update tests (#4919)

* [kernel] fix cpu adam

* [test] update gemini optim test
2023-10-16 21:56:53 +08:00
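
"Pure fp16" here means the CPU Adam kernel can step fp16 parameters directly, without fp32 master copies. A hedged sketch of what that enables, assuming HybridAdam accepts fp16 params and grads as this PR's tests suggest:

```python
import torch
import torch.nn as nn
from colossalai.nn.optimizer import HybridAdam

model = nn.Linear(64, 64).half()  # fp16 parameters, no fp32 master copy
optimizer = HybridAdam(model.parameters(), lr=1e-3)

# fabricate fp16 gradients to demonstrate a pure-fp16 step
for p in model.parameters():
    p.grad = torch.randn(p.shape).half()
optimizer.step()  # fused CPU kernel updates the fp16 weights in place
```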
Zian(Andy) Zheng 7768afbad0 Update flash_attention_patch.py
To remain compatible with a recent change in the Transformers library, where a new argument 'padding_mask' was added to the forward function of the attention layer.
https://github.com/huggingface/transformers/pull/25598
2023-10-16 14:00:45 +08:00
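
For reference, the shape of the compatibility change: transformers PR 25598 added a `padding_mask` keyword to the attention layers' forward. A patched forward only needs to accept (and may ignore) the new argument; the sketch below uses the Llama attention signature of that era, with the body elided:

```python
from typing import Optional, Tuple
import torch

def patched_attention_forward(
    self,
    hidden_states: torch.Tensor,
    attention_mask: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.LongTensor] = None,
    past_key_value: Optional[Tuple[torch.Tensor]] = None,
    output_attentions: bool = False,
    use_cache: bool = False,
    padding_mask: Optional[torch.LongTensor] = None,  # new in transformers PR 25598
):
    # flash-attention body unchanged; `padding_mask` is accepted purely
    # for signature compatibility with newer Transformers versions
    ...
```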
Xu Kai 611a5a80ca
[inference] Add smmoothquant for llama (#4904)
* [inference] add int8 rotary embedding kernel for smoothquant (#4843)

* [inference] add smoothquant llama attention (#4850)

* add smoothquant llama attention

* remove useless code

* remove useless code

* fix import error

* rename file name

* [inference] add silu linear fusion for smoothquant llama mlp (#4853)

* add silu linear

* update skip condition

* catch smoothquant cuda lib exception

* process exceptions for tests

* [inference] add llama mlp for smoothquant (#4854)

* add llama mlp for smoothquant

* fix down out scale

* remove duplicate lines

* add llama mlp check

* delete useless code

* [inference] add smoothquant llama (#4861)

* add smoothquant llama

* fix attention accuracy

* fix accuracy

* add kv cache and save pretrained

* refactor example

* delete smooth

* refactor code

* [inference] add smooth function and delete useless code for smoothquant (#4895)

* add smooth function and delete useless code

* update datasets

* remove duplicate import

* delete useless file

* refactor codes (#4902)

* refactor code

* add license

* add torch-int and smoothquant license
2023-10-16 11:28:44 +08:00
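
For context, the "smooth function" added in #4895 corresponds to SmoothQuant's core trick: migrate activation outliers into the weights with per-channel scales so both quantize well to int8. A sketch of the idea (following Xiao et al.'s formulation, not necessarily this PR's exact code):

```python
import torch

def smooth_weight(act_absmax: torch.Tensor, weight: torch.Tensor,
                  alpha: float = 0.5):
    """act_absmax: per-input-channel max |activation|, shape [in_features].
    weight: nn.Linear weight, shape [out_features, in_features]."""
    w_absmax = weight.abs().max(dim=0).values.clamp(min=1e-5)
    # s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)
    scales = (act_absmax.pow(alpha) / w_absmax.pow(1 - alpha)).clamp(min=1e-5)
    smoothed = weight * scales  # W' = W * diag(s); X must be divided by s,
    return smoothed, scales     # typically folded into the preceding LayerNorm
```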
Zhongkai Zhao a0684e7bd6
[feature] support no master weights option for low level zero plugin (#4816)
* [feature] support no master weights for low level zero plugin

* [feature] support no master weights for low level zero plugin, remove data copy when no master weights

* remove data copy and typecasting when no master weights

* not load weights to cpu when using no master weights

* fix grad: use fp16 grad when no master weights

* only do not update working param when no master weights

* fix: only do not update working param when no master weights

* fix: passing params in dict format in hybrid plugin

* fix: remove extra params (tp_process_group) in hybrid_parallel_plugin
2023-10-13 07:57:45 +00:00
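
A usage sketch for the no-master-weights option; the `master_weights` flag name is an assumption based on the feature description, so check the plugin's actual signature:

```python
from colossalai.booster import Booster
from colossalai.booster.plugin import LowLevelZeroPlugin

plugin = LowLevelZeroPlugin(
    stage=2,
    precision="fp16",
    master_weights=False,  # assumed flag: keep fp16 params/grads, skip fp32 copies
)
booster = Booster(plugin=plugin)
```

Skipping the fp32 master copy and the associated data copies and typecasts trades a little optimizer accuracy for memory and bandwidth, which is the point of this feature.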
Xu Kai 77a9328304
[inference] add llama2 support (#4898)
* add llama2 support

* fix multi group bug
2023-10-13 13:09:23 +08:00
Baizhou Zhang 39f2582e98
[hotfix] fix lr scheduler bug in torch 2.0 (#4864) 2023-10-12 14:04:24 +08:00
littsk 83b52c56cd
[feature] Add clip_grad_norm for hybrid_parallel_plugin (#4837)
* Add clip_grad_norm for hybrid_parallel_plugin

* polish code

* add unittests

* Move tp to a higher-level optimizer interface.

* bug fix

* polish code
2023-10-12 11:32:37 +08:00
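
A sketch of how the new clipping might be enabled, assuming it is driven by a `max_norm` argument on the plugin (inferred from the PR; verify against the signature):

```python
from colossalai.booster import Booster
from colossalai.booster.plugin import HybridParallelPlugin

plugin = HybridParallelPlugin(tp_size=2, pp_size=1, precision="fp16",
                              max_norm=1.0)  # assumed clipping knob
booster = Booster(plugin=plugin)
# after booster.backward(loss, optimizer), optimizer.step() clips the
# global grad norm across TP/PP ranks before applying the update
```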
Hongxin Liu df63564184
[gemini] support amp o3 for gemini (#4872)
* [gemini] support no reuse fp16 chunk

* [gemini] support no master weight for optim

* [gemini] support no master weight for gemini ddp

* [test] update gemini tests

* [test] update gemini tests

* [plugin] update gemini plugin

* [test] fix gemini checkpointio test

* [test] fix gemini checkpoint io
2023-10-12 10:39:08 +08:00
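
AMP "O3" means pure fp16 end to end: no fp32 master weights and no separate reusable fp16 chunk. A hedged configuration sketch, assuming the `master_weights` flag this PR appears to introduce:

```python
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin

plugin = GeminiPlugin(
    precision="fp16",
    master_weights=False,  # assumed flag: optimizer steps fp16 params directly
)
booster = Booster(plugin=plugin)
```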
ppt0011 c1fab951e7
Merge pull request #4889 from ppt0011/main
[doc] add reminder for issue encountered with hybrid adam
2023-10-12 10:27:10 +08:00
littsk ffd9a3cbc9
[hotfix] fix bug in sequence parallel test (#4887) 2023-10-11 19:30:41 +08:00
ppt0011 1dcaf249bd [doc] add reminder for issue encountered with hybrid adam 2023-10-11 17:51:14 +08:00
Xu Kai fdec650bb4
fix test llama (#4884) 2023-10-11 17:43:01 +08:00
Bin Jia 08a9f76b2f
[Pipeline Inference] Sync pipeline inference branch to main (#4820)
* [pipeline inference] pipeline inference (#4492)

* add pp stage manager as circle stage

* fix a bug when create process group

* add ppinfer basic framework

* add micro batch manager and support kvcache-pp gpt2 fwd

* add generate schedule

* use mb size to control mb number

* support generate with kv cache

* add output, remove unused code

* add test

* reuse shardformer to build model

* refactor some code and use the same attribute name of hf

* fix review and add test for generation

* remove unused file

* fix CI

* add cache clear

* fix code error

* fix typo

* [Pipeline inference] Modify to tieweight (#4599)

* add pp stage manager as circle stage

* fix a bug when create process group

* add ppinfer basic framework

* add micro batch manager and support kvcache-pp gpt2 fwd

* add generate schedule

* use mb size to control mb number

* support generate with kv cache

* add output, remove unused code

* add test

* reuse shardformer to build model

* refactor some code and use the same attribute name of hf

* fix review and add test for generation

* remove unused file

* modify the way of saving new tokens

* modify to tieweight

* modify test

* remove unused file

* solve review

* add docstring

* [Pipeline inference] support llama pipeline inference (#4647)

* support llama pipeline inference

* remove tie weight operation

* [pipeline inference] Fix the blocking of communication when ppsize is 2 (#4708)

* add benchmark verbose

* fix export tokens

* fix benchmark verbose

* add P2POp style to do p2p communication

* modify schedule as p2p type when ppsize is 2

* remove unused code and add docstring

* [Pipeline inference] Refactor code, add docsting, fix bug (#4790)

* add benchmark script

* update argparse

* fix fp16 load

* refactor code style

* add docstring

* polish code

* fix test bug

* [Pipeline inference] Add pipeline inference docs (#4817)

* add readme doc

* add an icon

* Add performance

* update table of contents

* refactor code (#4873)
2023-10-11 11:40:06 +08:00
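
To make the schedule above concrete, here is a purely illustrative sketch of the micro-batch generation loop the PR describes (per-micro-batch KV caches, stages passing activations along the pipeline); every name is a placeholder, not the actual API:

```python
def generate_schedule(stage, micro_batches, max_new_tokens):
    """Hypothetical pipeline-inference schedule: each stage runs its model
    shard over rotating micro-batches while per-micro-batch KV caches
    carry attention state between decoding steps."""
    kv_caches = [None] * len(micro_batches)
    for _ in range(max_new_tokens):
        for i, mb in enumerate(micro_batches):
            hidden = mb.next_input() if stage.is_first else stage.recv_prev()
            hidden, kv_caches[i] = stage.forward(hidden, kv_caches[i])
            if stage.is_last:
                mb.append_token(hidden.argmax(-1))  # greedy decoding
            else:
                stage.send_next(hidden)
```

With only two stages, plain blocking send/recv can deadlock, which is why #4708 above switches to batched P2POp-style communication.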
Camille Zhong 652adc2215 Update README.md 2023-10-10 23:19:34 +08:00
Camille Zhong afe10a85fd Update README.md 2023-10-10 23:19:34 +08:00
Camille Zhong d6c4b9b370 Update main README.md
add modelscope model link
2023-10-10 23:19:34 +08:00
Camille Zhong 3043d5d676 Update modelscope link in README.md
add modelscope link
2023-10-10 23:19:34 +08:00
flybird11111 6a21f96a87
[doc] update advanced tutorials, training gpt with hybrid parallelism (#4866)
* [doc] update advanced tutorials, training gpt with hybrid parallelism

* [doc] update advanced tutorials, training gpt with hybrid parallelism

* update vit tutorials

* update vit tutorials

* update vit tutorials

* update vit tutorials

* update en/train_vit_with_hybrid_parallel.py

* fix

* resolve comments

* fix
2023-10-10 08:18:55 +00:00
Blagoy Simandoff 8aed02b957
[nfc] fix minor typo in README (#4846) 2023-10-07 17:51:11 +08:00
Camille Zhong cd6a962e66 [NFC] polish code style (#4799) 2023-10-07 13:36:52 +08:00
Michelle 07ed155e86 [NFC] polish colossalai/inference/quant/gptq/cai_gptq/__init__.py code style (#4792) 2023-10-07 13:36:52 +08:00
littsk eef96e0877 polish code for gptq (#4793) 2023-10-07 13:36:52 +08:00
Hongxin Liu cb3a25a062
[checkpointio] hotfix torch 2.0 compatibility (#4824) 2023-10-07 10:45:52 +08:00
ppt0011 ad23460cf8
Merge pull request #4856 from KKZ20/test/model_support_for_low_level_zero
[test] remove redundant model output transformation code in torchrec
2023-10-06 09:32:33 +08:00
ppt0011 81ee91f2ca
Merge pull request #4858 from Shawlleyw/main
[doc]: typo in document of booster low_level_zero plugin
2023-10-06 09:27:54 +08:00
shaoyuw c97a3523db fix: typo in comment of low_level_zero plugin 2023-10-05 16:30:34 +00:00
Zhongkai Zhao db40e086c8 [test] modify model supporting part of low_level_zero plugin (including corresponding docs) 2023-10-05 15:10:31 +08:00
Xu Kai d1fcc0fa4d
[infer] fix test bug (#4838)
* fix test bug

* delete useless code

* fix typo
2023-10-04 10:01:03 +08:00
Jianghai 013a4bedf0
[inference] fix import bug and delete useless init (#4830)
* fix import bug and delete useless init

* fix

* fix

* fix
2023-10-04 09:18:45 +08:00
Yuanheng Zhao 573f270537
[Infer] Serving example w/ ray-serve (multiple GPU case) (#4841)
* fix imports

* add ray-serve with Colossal-Infer tp

* trivial: send requests script

* add README

* fix worker port

* fix readme

* use app builder and autoscaling

* trivial: input args

* clean code; revise readme

* testci (skip example test)

* use auto model/tokenizer

* revert imports fix (fixed in other PRs)
2023-10-02 17:48:38 +08:00
Yuanheng Zhao 3a74eb4b3a
[Infer] Colossal-Inference serving example w/ TorchServe (single GPU case) (#4771)
* add Colossal-Inference serving example w/ TorchServe

* add dockerfile

* fix dockerfile

* fix dockerfile: fix commit hash, install curl

* refactor file structure

* revise readme

* trivial

* trivial: dockerfile format

* clean dir; revise readme

* fix comments: fix imports and configs

* fix formats

* remove unused requirements
2023-10-02 17:42:37 +08:00
Tong Li ed06731e00
update Colossal (#4832) 2023-09-28 16:05:05 +08:00
Xu Kai c3bef20478
add autotune (#4822) 2023-09-28 13:47:35 +08:00
binmakeswell 822051d888
[doc] update slack link (#4823) 2023-09-27 17:37:39 +08:00
Yuanchen 1fa8c5e09f
Update Qwen-7B results (#4821)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-09-27 17:33:54 +08:00
flybird11111 be400a0936
[chat] fix gemini strategy (#4698)
* [chat] fix gemini strategy

* [chat] fix gemini strategy

* [chat] fix gemini strategy

* [chat] fix gemini strategy

* [chat] fix gemini strategy (squash of 2 commits)

* [chat] fix gemini strategy

update llama2 example

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* fix

* fix

* fix

* fix

* fix

* Update train_prompts.py
2023-09-27 13:15:32 +08:00
Tong Li bbbcac26e8
fix format (#4815) 2023-09-27 12:50:22 +08:00
github-actions[bot] fb46d05cdf
[format] applied code formatting on changed files in pull request 4595 (#4602)
Co-authored-by: github-actions <github-actions@github.com>
2023-09-27 10:45:03 +08:00
littsk 11f1e426fe
[hotfix] Correct several erroneous code comments (#4794) 2023-09-27 10:43:03 +08:00
littsk 54b3ad8924
[hotfix] fix norm type error in zero optimizer (#4795) 2023-09-27 10:35:24 +08:00
Hongxin Liu da15fdb9ca
[doc] add lazy init docs (#4808) 2023-09-27 10:24:04 +08:00
Yan haixu a22706337a
[misc] add last_epoch in CosineAnnealingWarmupLR (#4778) 2023-09-26 14:43:46 +08:00
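The `last_epoch` argument lets a resumed run restore the scheduler's position. A small sketch, with constructor arguments other than `last_epoch` assumed:

```python
import torch
from colossalai.nn.lr_scheduler import CosineAnnealingWarmupLR

model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = CosineAnnealingWarmupLR(
    optimizer,
    total_steps=1000,
    warmup_steps=100,
    last_epoch=499,  # resume as if 500 steps had already run
)
```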
Chandler-Bing b6cf0aca55
[hotfix] change llama2 Colossal-LLaMA-2 script filename (#4800)
change filename:
pretraining.py -> train.py
(there was no file named pretraining.py; the reference was wrong)
2023-09-26 11:44:27 +08:00
Desperado-Jia 62b6af1025
Merge pull request #4805 from TongLi3701/docs/fix
[doc] Update TODO in README of Colossal-LLaMA-2
2023-09-26 11:39:35 +08:00
Tong Li 8cbce6184d update 2023-09-26 11:36:53 +08:00
Hongxin Liu 4965c0dabd
[lazy] support from_pretrained (#4801)
* [lazy] patch from pretrained

* [lazy] fix from pretrained and add tests

* [devops] update ci
2023-09-26 11:04:11 +08:00
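
A usage sketch of lazy `from_pretrained` as this PR describes it: the checkpoint is not materialized at load time, only when the model is later boosted and sharded. The exact interplay with `Booster` is assumed here:

```python
from colossalai.lazy import LazyInitContext
from transformers import AutoModelForCausalLM

with LazyInitContext():
    # weights are recorded lazily instead of being allocated and loaded now
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# parameters are materialized (and checkpoint shards actually read) later,
# e.g. when booster.boost(model, ...) shards the model across devices
```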
Tong Li bd014673b0 update readme 2023-09-26 10:58:05 +08:00
Baizhou Zhang 64a08b2dc3
[checkpointio] support unsharded checkpointIO for hybrid parallel (#4774)
* support unsharded saving/loading for model

* support optimizer unsharded saving

* update doc

* support unsharded loading for optimizer

* small fix
2023-09-26 10:58:03 +08:00
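
A sketch of the unsharded path, assuming the standard `Booster` checkpoint API with `shard=False` and a model/optimizer pair already returned by `booster.boost(...)`:

```python
# save single-file (unsharded) checkpoints under hybrid parallelism
booster.save_model(model, "model.pt", shard=False)
booster.save_optimizer(optimizer, "optimizer.pt", shard=False)

# loading mirrors saving
booster.load_model(model, "model.pt")
booster.load_optimizer(optimizer, "optimizer.pt")
```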