ColossalAI

Commit Graph

Author	SHA1	Message	Date
YH	80aed29cd3	[zero] Refactor ZeroContextConfig class using dataclass (#3186 )	2 years ago
YH	9d644ff09f	Fix docstr for zero statedict (#3185 )	2 years ago
ver217	823f3b9cf4	[doc] add deepspeed citation and copyright (#2996 ) * [doc] add deepspeed citation and copyright * [doc] add deepspeed citation and copyright * [doc] add deepspeed citation and copyright	2 years ago
YH	7b13f7db18	[zero] trivial zero optimizer refactoring (#2869 ) * Fix mionr grad store interface * Apply lint	2 years ago
Boyuan Yao	8e3f66a0d1	[zero] fix wrong import (#2777 )	2 years ago
Nikita Shulga	01066152f1	Don't use `torch._six` (#2775 ) * Don't use `torch._six` This is a private API which is gone after https://github.com/pytorch/pytorch/pull/94709 * Update common.py	2 years ago
YH	ae86a29e23	Refact method of grad store (#2687 )	2 years ago
HELSON	df4f020ee3	[zero1&2] only append parameters with gradients (#2681 )	2 years ago
HELSON	b528eea0f0	[zero] add zero wrappers (#2523 ) * [zero] add zero wrappers * change names * add wrapper functions to init	2 years ago
HELSON	077a5cdde4	[zero] fix gradient clipping in hybrid parallelism (#2521 ) * [zero] fix gradient clipping in hybrid parallelism * [testing] change model name to avoid pytest warning * [hotfix] fix unit testing	2 years ago
HELSON	d565a24849	[zero] add unit testings for hybrid parallelism (#2486 )	2 years ago
HELSON	a5dc4253c6	[zero] polish low level optimizer (#2473 )	2 years ago
Jiarui Fang	867c8c2d3a	[zero] low level optim supports ProcessGroup (#2464 )	2 years ago
HELSON	7829aa094e	[ddp] add is_ddp_ignored (#2434 ) [ddp] rename to is_ddp_ignored	2 years ago
HELSON	62c38e3330	[zero] polish low level zero optimizer (#2275 )	2 years ago
HELSON	a7d95b7024	[example] add zero1, zero2 example in GPT examples (#2146 ) * [example] add zero1 and zero2 for GPT * update readme in gpt example * polish code * change init value * update readme	2 years ago
Jiarui Fang	c89c66a858	[Gemini] update API of the chunkmemstatscollector. (#2129 )	2 years ago
Jiarui Fang	2938edf446	[Gemini] update the non model data record method in runtime memory tracer (#2128 )	2 years ago
Jiarui Fang	e99edfcb51	[NFC] polish comments for Chunk class (#2116 )	2 years ago
Jiarui Fang	33f4412102	[Gemini] use MemStats to store the tracing data. Seperate it from Collector. (#2084 )	2 years ago
Jiarui Fang	b3b89865e2	[Gemini] ParamOpHook -> ColoParamOpHook (#2080 )	2 years ago
HELSON	a1ce02d740	[zero] test gradient accumulation (#1964 ) * [zero] fix memory leak for zero2 * [zero] test gradient accumulation * [zero] remove grad clip test	2 years ago
Jiarui Fang	cc0ed7cf33	[Gemini] ZeROHookV2 -> GeminiZeROHook (#1972 )	2 years ago
Jiarui Fang	c4739a725a	[Gemini] polish memstats collector (#1962 )	2 years ago
Jiarui Fang	f7e276fa71	[Gemini] add GeminiAdamOptimizer (#1960 )	2 years ago
HELSON	7066dfbf82	[zero] fix memory leak for zero2 (#1955 )	2 years ago
HELSON	6e51d296f0	[zero] migrate zero1&2 (#1878 ) * add zero1&2 optimizer * rename test ditectory * rename test files * change tolerance in test	2 years ago
Zihao	20e255d4e8	MemStatsCollectorStatic (#1765 )	2 years ago
HELSON	c6a1a62636	[hotfix] fix zero's incompatibility with checkpoint in torch-1.12 (#1786 ) * [hotfix] fix zero's incompatibility with checkpoint in torch-1.12 * [zero] add cpu shard init * [zero] add tiny example test * [colo_tensor] fix bugs for torch-1.11	2 years ago
CsRic	ea961d8fd1	[NFC] polish colossalai/zero/sharded_param/__init__.py code style (#1717 ) Co-authored-by: ric <mkkt_bkkt@mail.ustc.edu.cn>	2 years ago
HELSON	1468e4bcfc	[zero] add constant placement policy (#1705 ) * fixes memory leak when paramter is in fp16 in ZeroDDP init. * bans chunk releasement in CUDA. Only when a chunk is about to offload, it is allowed to release. * adds a constant placement policy. With it, users can allocate a reserved caching memory space for parameters.	2 years ago
HELSON	b28991dd0a	[feature] A new ZeRO implementation (#1644 )	2 years ago
Jiarui Fang	c5d39215f6	Revert "[feature] new zero implementation (#1623 )" (#1643 ) This reverts commit `5be118f405`.	2 years ago
HELSON	5be118f405	[feature] new zero implementation (#1623 )	2 years ago
HELSON	f7f2248771	[moe] fix MoE bugs (#1628 ) * remove forced FP32 modules * correct no_shard-contexts' positions	2 years ago
ver217	c9e8ce67b8	fix move fp32 shards (#1604 )	2 years ago
Fazzie-Maqianli	06dccdde44	[NFC] polish colossalai/zero/sharded_model/reduce_scatter.py code style (#1554 )	2 years ago
ver217	821c6172e2	[utils] Impl clip_grad_norm for ColoTensor and ZeroOptimizer (#1442 )	2 years ago
ver217	6df3e19be9	[hotfix] zero optim prevents calling inner optim.zero_grad (#1422 )	2 years ago
ver217	8dced41ad0	[zero] zero optim state_dict takes only_rank_0 (#1384 ) * zero optim state_dict takes only_rank_0 * fix unit test	2 years ago
ver217	828b9e5e0d	[hotfix] fix zero optim save/load state dict (#1381 )	2 years ago
ver217	6b43c789fd	fix zero optim backward_by_grad and save/load (#1353 )	2 years ago
ver217	d068af81a3	[doc] update rst and docstring (#1351 ) * update rst * add zero docstr * fix docstr * remove fx.tracer.meta_patch * fix docstr * fix docstr * update fx rst * fix fx docstr * remove useless rst	2 years ago
ver217	ce470ba37e	[checkpoint] sharded optim save/load grad scaler (#1350 )	2 years ago
ver217	7a05367101	[hotfix] shared model returns cpu state_dict (#1328 )	2 years ago
Jiarui Fang	4165eabb1e	[hotfix] remove potiential circle import (#1307 ) * make it faster * [hotfix] remove circle import	2 years ago
ver217	a45ddf2d5f	[hotfix] fix sharded optim step and clip_grad_norm (#1226 )	2 years ago
Jiarui Fang	a444633d13	warmup ratio configration (#1192 )	2 years ago
Jiarui Fang	372f791444	[refactor] move chunk and chunkmgr to directory gemini (#1182 )	2 years ago
ver217	9e1daa63d2	[zero] sharded optim supports loading local state dict (#1170 ) * sharded optim supports loading local state dict * polish code * add unit test	2 years ago

207 Commits (b09adff724c2bbded1c71cc51a707f736a0e2899)