ColossalAI

Commit Graph

Author	SHA1	Message	Date
LuGY	1a49a5ea00	[zero] support shard optimizer state dict of zero (#4194 ) * support shard optimizer of zero * polish code * support sync grad manually	1 year ago
LuGY	dd7cc58299	[zero] add state dict for low level zero (#4179 ) * add state dict for zero * fix unit test * polish	1 year ago
LuGY	c668801d36	[zero] allow passing process group to zero12 (#4153 ) * allow passing process group to zero12 * union tp-zero and normal-zero * polish code	1 year ago
LuGY	79cf1b5f33	[zero]support no_sync method for zero1 plugin (#4138 ) * support no sync for zero1 plugin * polish * polish	1 year ago
LuGY	c6ab96983a	[zero] refactor low level zero for shard evenly (#4030 ) * refactor low level zero * fix zero2 and support cpu offload * avg gradient and modify unit test * refactor grad store, support layer drop * refactor bucket store, support grad accumulation * fix and update unit test of zero and ddp * compatible with tp, ga and unit test * fix memory leak and polish * add zero layer drop unittest * polish code * fix import err in unit test * support diffenert comm dtype, modify docstring style * polish code * test padding and fix * fix unit test of low level zero * fix pad recording in bucket store * support some models * polish	1 year ago
Yuanchen	5187c96b7c	support session-based training (#4313 ) Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>	1 year ago
binmakeswell	ef4b99ebcd	add llama example CI	1 year ago
yuxuan-lou	0991405361	[NFC] polish applications/Chat/coati/models/utils.py codestyle (#4277 ) * [NFC] polish colossalai/context/random/__init__.py code style * [NFC] polish applications/Chat/coati/models/utils.py code style	1 year ago
Zirui Zhu	9e512938f6	[NFC] polish applications/Chat/coati/trainer/strategies/base.py code style (#4278 )	1 year ago
Ziheng Qin	c972d65311	applications/Chat/.gitignore (#4279 ) Co-authored-by: henryqin1997 <henryqin1997@gamil.com>	1 year ago
RichardoLuo	709e121cd5	[NFC] polish applications/Chat/coati/models/generation.py code style (#4275 )	1 year ago
Yuanchen	dc1b6127f9	[NFC] polish applications/Chat/inference/server.py code style (#4274 ) Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>	1 year ago
アマデウス	caa4433072	[NFC] fix format of application/Chat/coati/trainer/utils.py (#4273 )	1 year ago
Xu Kai	1ce997daaf	[NFC] polish applications/Chat/examples/train_reward_model.py code style (#4271 )	1 year ago
dayellow	a50d39a143	[NFC] fix: format (#4270 ) * [NFC] polish colossalai/fx/profiler/experimental/profiler_module/embedding.py code style * [NFC] polish colossalai/communication/utils.py code style --------- Co-authored-by: Minghao Huang <huangminghao@luchentech.com>	1 year ago
Wenhao Chen	fee553288b	[NFC] polish runtime_preparation_pass style (#4266 )	1 year ago
YeAnbang	3883db452c	[NFC] polish unary_elementwise_generator.py code style (#4267 ) Co-authored-by: aye42 <aye42@gatech.edu>	1 year ago
shenggan	798cb72907	[NFC] polish applications/Chat/coati/trainer/base.py code style (#4260 )	1 year ago
Zheng Zangwei (Alex Zheng)	b2debdc09b	[NFC] polish applications/Chat/coati/dataset/sft_dataset.py code style (#4259 )	1 year ago
梁爽	abe4f971e0	[NFC] polish colossalai/booster/plugin/low_level_zero_plugin.py code style (#4256 ) Co-authored-by: supercooledith <893754954@qq.com>	1 year ago
Yanjia0	c614a99d28	[NFC] polish colossalai/auto_parallel/offload/amp_optimizer.py code style (#4255 )	1 year ago
ocd_with_naming	85774f0c1f	[NFC] polish colossalai/cli/benchmark/utils.py code style (#4254 )	1 year ago
CZYCW	dee1c96344	[NFC] policy applications/Chat/examples/ray/mmmt_prompt.py code style (#4250 )	1 year ago
Junming Wu	77c469e1ba	[NFC] polish applications/Chat/coati/models/base/actor.py code style (#4248 )	1 year ago
Camille Zhong	915ed8bed1	[NFC] polish applications/Chat/inference/requirements.txt code style (#4265 )	1 year ago
Michelle	86cf6aed5b	Fix/format (#4261 ) * revise shardformer readme (#4246) * [example] add llama pretraining (#4257) * [NFC] polish colossalai/communication/p2p.py code style --------- Co-authored-by: Jianghai <72591262+CjhHa1@users.noreply.github.com> Co-authored-by: binmakeswell <binmakeswell@gmail.com> Co-authored-by: Qianran Ma <qianranm@luchentech.com>	1 year ago
Jianghai	b366f1d99f	[NFC] Fix format for mixed precision (#4253 ) * [NFC] polish colossalai/booster/mixed_precision/mixed_precision_base.py code style	1 year ago
Hongxin Liu	02192a632e	[ci] support testmon core pkg change detection (#4305 )	1 year ago
Baizhou Zhang	c6f6005990	[checkpointio] Sharded Optimizer Checkpoint for Gemini Plugin (#4302 ) * sharded optimizer checkpoint for gemini plugin * modify test to reduce testing time * update doc * fix bug when keep_gatherd is true under GeminiPlugin	1 year ago
Hongxin Liu	fc5cef2c79	[lazy] support init on cuda (#4269 ) * [lazy] support init on cuda * [test] update lazy init test * [test] fix transformer version	1 year ago
Cuiqing Li	4b977541a8	[Kernels] added triton-implemented of self attention for colossal-ai (#4241 ) * added softmax kernel * added qkv_kernel * added ops * adding tests * upload tets * fix tests * debugging * debugging tests * debugging * added * fixed errors * added softmax kernel * clean codes * added tests * update tests * update tests * added attention * add * fixed pytest checking * add cuda check * fix cuda version * fix typo	1 year ago
binmakeswell	7ff11b5537	[example] add llama pretraining (#4257 )	1 year ago
Jianghai	9a4842c571	revise shardformer readme (#4246 )	1 year ago
github-actions[bot]	4e9b09c222	Automated submodule synchronization (#4217 ) Co-authored-by: github-actions <github-actions@github.com>	1 year ago
Frank Lee	c1cf752021	[docker] fixed ninja build command (#4203 ) * [docker] fixed ninja build command * polish code	1 year ago
Baizhou Zhang	58913441a1	[checkpointio] Unsharded Optimizer Checkpoint for Gemini Plugin (#4141 ) * [checkpointio] unsharded optimizer checkpoint for Gemini plugin * [checkpointio] unsharded optimizer checkpoint for Gemini using all_gather	1 year ago
Frank Lee	fee32a3b78	[docker] added ssh and rdma support for docker (#4192 )	1 year ago
Frank Lee	190a6ea9c2	[dtensor] fixed readme file name and removed deprecated file (#4162 )	1 year ago
Frank Lee	cc3cbe9f6f	[workflow] show test duration (#4159 )	1 year ago
Hongxin Liu	1908caad38	[cli] hotfix launch command for multi-nodes (#4165 )	1 year ago
digger yu	2ac24040eb	fix some typo colossalai/shardformer (#4160 )	1 year ago
github-actions[bot]	c77b3b19be	[format] applied code formatting on changed files in pull request 4152 (#4157 ) Co-authored-by: github-actions <github-actions@github.com>	1 year ago
Frank Lee	f447ca1811	[chat] removed cache file (#4155 )	1 year ago
Frank Lee	89f45eda5a	[shardformer] added development protocol for standardization (#4149 )	1 year ago
Frank Lee	1fb0d95df0	[shardformer] made tensor parallelism configurable (#4144 ) * [shardformer] made tensor parallelism configurable * polish code	1 year ago
Frank Lee	74257cb446	[shardformer] refactored some doc and api (#4137 ) * [shardformer] refactored some doc and api * polish code	1 year ago
jiangmingyan	7f9b30335b	[shardformer] write an shardformer example with bert finetuning (#4126 ) * [shardformer] add benchmark of shardformer * [shardformer] add benchmark of shardformer	1 year ago
Frank Lee	ae035d305d	[shardformer] added embedding gradient check (#4124 )	1 year ago
Frank Lee	44a190e6ac	[shardformer] import huggingface implicitly (#4101 )	1 year ago
Frank Lee	6a88bae4ec	[shardformer] integrate with data parallelism (#4103 )	1 year ago

3125 Commits (5c6c5d6be316a4f4e867d0d8049b508e0d59ad6c)
