ColossalAI

Commit Graph

Author	SHA1	Message	Date
ver217	a203b709d5	[hotfix] fix init context (#1543) * fix init context * fix lazy init ctx	2022-09-06 11:45:08 +08:00
Frank Lee	2cc1175c76	[fx] tested the complete workflow for auto-parallel (#1336) * [fx] tested the complete workflow for auto-parallel * polish code * polish code * polish code	2022-07-20 10:45:17 +08:00
Frank Lee	250be4d31e	[utils] integrated colotensor with lazy init context (#1324) * [utils] integrated colotensor with lazy init context * polish code * polish code * polish code	2022-07-15 17:47:12 +08:00
Jiarui Fang	c92f84fcdb	[tensor] distributed checkpointing for parameters (#1240)	2022-07-12 15:51:06 +08:00
Jiarui Fang	9bcd2fd4af	[tensor] a shorter shard and replicate spec (#1245)	2022-07-11 15:51:48 +08:00
Jiarui Fang	3b500984b1	[tensor] fix some unittests (#1234)	2022-07-08 14:18:30 +08:00
Jiarui Fang	f38006ea83	[checkpoint] checkpoint for ColoTensor Model (#1196)	2022-07-06 17:22:03 +08:00
Jiarui Fang	ae7d3f4927	[refactor] move process group from _DistSpec to ColoTensor. (#1203)	2022-07-06 16:15:16 +08:00
YuliangLiu0306	63d2a93878	[context]support arbitary module materialization. (#1193) * [CLI] add CLI launcher * Revert "[CLI] add CLI launcher" This reverts commit `df7e6506d4`. * [context]support arbitary module materialization. * [test]add numerical check for lazy init context.	2022-07-04 10:12:02 +08:00
YuliangLiu0306	2053e138a2	[context]use meta tensor to init model lazily. (#1187) * [CLI] add CLI launcher * Revert "[CLI] add CLI launcher" This reverts commit `df7e6506d4`. * [context]use meta tensor to init model lazily. * polish * make module with device kwargs bypass the normal init. * change unit test to adapt updated context.	2022-06-29 21:02:30 +08:00
Jiarui Fang	4b9bba8116	[ColoTensor] rename APIs and add output_replicate to ComputeSpec (#1168)	2022-06-24 13:08:54 +08:00
Frank Lee	f8eec98ff5	[tensor] fixed non-serializable colo parameter during model checkpointing (#1153)	2022-06-22 11:43:38 +08:00
Frank Lee	73ad05fc8c	[zero] added error message to handle on-the-fly import of torch Module class (#1135) * [zero] added error message to handle on-the-fly import of torch Module class * polish code	2022-06-20 11:24:27 +08:00
Frank Lee	2b2dc1c86b	[pipeline] refactor the pipeline module (#1087) * [pipeline] refactor the pipeline module * polish code	2022-06-10 11:27:38 +08:00
Frank Lee	bad5d4c0a1	[context] support lazy init of module (#1088) * [context] support lazy init of module * polish code	2022-06-10 10:09:48 +08:00
Frank Lee	bfdc5ccb7b	[context] maintain the context object in with statement (#1073)	2022-06-07 10:48:45 +08:00
Jiarui Fang	49832b2344	[refactory] add nn.parallel module (#1068)	2022-06-06 15:34:41 +08:00
Jiarui Fang	a00644079e	reorgnize colotensor directory (#1062) * reorgnize colotensor directory * polish code	2022-06-03 18:04:22 +08:00
Ziyue Jiang	df9dcbbff6	[Tensor] add hybrid device demo and fix bugs (#1059)	2022-06-03 12:09:49 +08:00
Ziyue Jiang	7c530b9de2	[Tensor] add Parameter inheritance for ColoParameter (#1041) * add Parameter inheritance for ColoParameter * remove tricks * remove tricks * polish * polish	2022-05-30 17:23:44 +08:00
Ziyue Jiang	6c5996a56e	[Tensor] add module check and bert test (#1031) * add Embedding * Add bert test * polish * add check module test * polish * polish * polish * polish	2022-05-26 18:15:42 +08:00
Ziyue Jiang	32291dd73f	[Tensor] add module handler for linear (#1021) * add module spec for linear * polish * polish * polish	2022-05-26 11:50:44 +08:00
ver217	007ca0df92	fix colo init context (#1026)	2022-05-25 20:41:58 +08:00
ver217	ad536e308e	[tensor] refactor colo-tensor (#992) * refactor colo-tensor and update linear op * polish code * polish code * update ops and unit tests * update unit tests * polish code * rename dist_spec module * polish code * polish code * remove unneeded import * fix pipelinable	2022-05-19 12:44:59 +08:00
Ziyue Jiang	d73c2b1d79	[Tensor] fix init context (#931) * change torch.Parameter to ColoParameter * fix post assignment for init context * polish * polish	2022-05-11 15:48:12 +08:00
Ziyue Jiang	dfc88b85ea	[Tensor] simplify named param (#928) * simplify ColoModulize * simplify ColoModulize * polish * polish	2022-05-11 10:54:19 +08:00
YuliangLiu0306	32a45cd7ef	[pipelinable]use pipelinable to support GPT model. (#903) * [CLI] add CLI launcher * Revert "[CLI] add CLI launcher" This reverts commit `df7e6506d4`. * [pipelinable]use pipelinable to support GPT model. * fix a bug caused by ShardedModel * polish * fix front func list	2022-05-11 09:23:58 +08:00
Ziyue Jiang	c195d2814c	[Tensor] add from_pretrained support and bert pretrained test (#921) * add from_pretrained support and test * polish * polish * polish * polish	2022-05-09 16:11:47 +08:00
Jiarui Fang	ab95ec9aea	[Tensor] init ColoParameter (#914)	2022-05-06 12:57:14 +08:00
Jiarui Fang	d16671da75	[Tensor] initialize the ColoOptimizer (#898) * [Tensor] activation is an attr of ColoTensor * [Tensor] add optimizer * only detach parameters in context * polish code	2022-04-28 15:23:40 +08:00
Jiarui Fang	676f191532	[Tensor] activation is an attr of ColoTensor (#897)	2022-04-28 14:43:22 +08:00
Jiarui Fang	26c49639d8	[Tensor] overriding paramters() for Module using ColoTensor (#889)	2022-04-27 15:28:59 +08:00
ver217	4df6471f5d	fix import error (#880)	2022-04-26 19:28:40 +08:00
Jiarui Fang	d01d3b8cb0	colo init context add device attr. (#866)	2022-04-25 14:24:26 +08:00
YuliangLiu0306	c6930d8ddf	[pipelinable]use ColoTensor to replace dummy tensor. (#853)	2022-04-24 18:31:22 +08:00
Jiarui Fang	62f059251b	[Tensor] init a tp network training unittest (#849)	2022-04-24 16:43:44 +08:00
ver217	0dea140760	[hotfix] add deconstructor for stateful tensor (#848) * add deconstructor for stateful tensor * fix colo init context	2022-04-24 15:03:04 +08:00
YuliangLiu0306	35ea6e1023	[pipelinable]use pipelinable context to initialize non-pipeline model (#816) * [CLI] add CLI launcher * Revert "[CLI] add CLI launcher" This reverts commit `df7e6506d4`. * [pipeline]add module lazy init feature to support large model initization. * [pipeline]add to_layer_list and partition method to support arbitrary non-pp model * refactor the module structure * polish * [pipelinable]add unit test for pipelinable * polish * polish * Fix CodeFactor issues.	2022-04-24 13:03:12 +08:00
Jiarui Fang	8789850eea	Init Conext supports lazy allocate model memory (#842)	2022-04-22 18:03:35 +08:00
Jiarui Fang	eb1b89908c	[refactor] moving InsertPostInitMethodToModuleSubClasses to utils. (#824)	2022-04-21 16:03:18 +08:00

40 Commits (c8e9b2ad784a501c9ed4f4bf6d5943528d23be7d)