ColossalAI

Author	SHA1	Message	Date
HELSON	2d1a7dfe5f	[zero] add strict ddp mode (#2508 ) * [zero] add strict ddp mode * [polish] add comments for strict ddp mode * [zero] fix test error	2023-01-20 14:04:38 +08:00
oahzxl	c04f183237	[autochunk] support parsing blocks (#2506 )	2023-01-20 11:18:17 +08:00
Super Daniel	35c0c0006e	[utils] lazy init. (#2148 ) * [utils] lazy init. * [utils] remove description. * [utils] complete. * [utils] finalize. * [utils] fix names.	2023-01-20 10:49:00 +08:00
oahzxl	72341e65f4	[auto-chunk] support extramsa (#3 ) (#2504 )	2023-01-20 10:13:03 +08:00
Ziyue Jiang	0f02b8c6e6	add avg partition (#2483 ) Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>	2023-01-19 13:54:50 +08:00
アマデウス	99d9713b02	Revert "Update parallel_context.py (#2408 )" This reverts commit `7d5640b9db`.	2023-01-19 12:27:48 +08:00
oahzxl	ecccc91f21	[autochunk] support autochunk on evoformer (#2497 )	2023-01-19 11:41:00 +08:00
oahzxl	5db3a5bf42	[fx] allow control of ckpt_codegen init (#2498 ) * [fx] allow control of ckpt_codegen init Currently in ColoGraphModule, ActivationCheckpointCodeGen will be set automatically in __init__. But other codegen can't be set if so. So I add an arg to control whether to set ActivationCheckpointCodeGen in __init__. * code style	2023-01-18 17:02:46 +08:00
HELSON	d565a24849	[zero] add unit testings for hybrid parallelism (#2486 )	2023-01-18 10:36:10 +08:00
oahzxl	4953b4ace1	[autochunk] support evoformer tracer (#2485 ) support full evoformer tracer, which is a main module of alphafold. previously we just support a simplifed version of it. 1. support some evoformer's op in fx 2. support evoformer test 3. add repos for test code	2023-01-16 19:25:05 +08:00
YuliangLiu0306	67e1912b59	[autoparallel] support origin activation ckpt on autoprallel system (#2468 )	2023-01-16 16:25:13 +08:00
Ziyue Jiang	fef5c949c3	polish pp middleware (#2476 ) Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>	2023-01-13 16:56:01 +08:00
HELSON	a5dc4253c6	[zero] polish low level optimizer (#2473 )	2023-01-13 14:56:17 +08:00
Frank Lee	8b7495dd54	[example] integrate seq-parallel tutorial with CI (#2463 )	2023-01-13 14:40:05 +08:00
Jiarui Fang	867c8c2d3a	[zero] low level optim supports ProcessGroup (#2464 )	2023-01-13 10:05:58 +08:00
Frank Lee	14d9299360	[cli] fixed hostname mismatch error (#2465 )	2023-01-12 14:52:09 +08:00
Haofan Wang	9358262992	Fix False warning in initialize.py (#2456 ) * Update initialize.py * pre-commit run check	2023-01-12 13:49:01 +08:00
YuliangLiu0306	8221fd7485	[autoparallel] update binary elementwise handler (#2451 ) * [autoparallel] update binary elementwise handler * polish	2023-01-12 09:35:10 +08:00
HELSON	2bfeb24308	[zero] add warning for ignored parameters (#2446 )	2023-01-11 15:30:09 +08:00
Frank Lee	39163417a1	[example] updated the hybrid parallel tutorial (#2444 ) * [example] updated the hybrid parallel tutorial * polish code	2023-01-11 15:17:17 +08:00
HELSON	5521af7877	[zero] fix state_dict and load_state_dict for ddp ignored parameters (#2443 ) * [ddp] add is_ddp_ignored [ddp] rename to is_ddp_ignored * [zero] fix state_dict and load_state_dict * fix bugs * [zero] update unit test for ZeroDDP	2023-01-11 14:55:41 +08:00
YuliangLiu0306	2731531bc2	[autoparallel] integrate device mesh initialization into autoparallelize (#2393 ) * [autoparallel] integrate device mesh initialization into autoparallelize * add megatron solution * update gpt autoparallel examples with latest api * adapt beta value to fit the current computation cost	2023-01-11 14:03:49 +08:00
Frank Lee	c72c827e95	[cli] provided more details if colossalai run fail (#2442 )	2023-01-11 13:56:42 +08:00
Super Daniel	c41e59e5ad	[fx] allow native ckpt trace and codegen. (#2438 )	2023-01-11 13:49:59 +08:00
YuliangLiu0306	41429b9b28	[autoparallel] add shard option (#2423 )	2023-01-11 13:40:33 +08:00
HELSON	7829aa094e	[ddp] add is_ddp_ignored (#2434 ) [ddp] rename to is_ddp_ignored	2023-01-11 12:22:45 +08:00
HELSON	bb4e9a311a	[zero] add inference mode and its unit test (#2418 )	2023-01-11 10:07:37 +08:00
Jiarui Fang	93f62dd152	[autochunk] add autochunk feature	2023-01-10 16:04:42 +08:00
HELSON	dddacd2d2c	[hotfix] add norm clearing for the overflow step (#2416 )	2023-01-10 15:43:06 +08:00
oahzxl	7ab2db206f	adapt new fx	2023-01-10 11:56:00 +08:00
oahzxl	e532679c95	Merge branch 'main' of https://github.com/oahzxl/ColossalAI into chunk	2023-01-10 11:29:01 +08:00
Haofan Wang	7d5640b9db	Update parallel_context.py (#2408 )	2023-01-10 11:27:23 +08:00
oahzxl	fd818cf144	change imports	2023-01-10 11:10:45 +08:00
oahzxl	a591d45b29	add available	2023-01-10 10:56:39 +08:00
oahzxl	615e7e68d9	update doc	2023-01-10 10:44:07 +08:00
oahzxl	7d4abaa525	add doc	2023-01-10 09:59:47 +08:00
oahzxl	1be0ac3cbf	add doc for trace indice	2023-01-09 17:59:52 +08:00
oahzxl	0b6af554df	remove useless function	2023-01-09 17:46:43 +08:00
oahzxl	d914a21d64	rename	2023-01-09 17:45:36 +08:00
oahzxl	865f2e0196	rename	2023-01-09 17:42:25 +08:00
HELSON	ea13a201bb	[polish] polish code for get_static_torch_model (#2405 ) * [gemini] polish code * [testing] remove code * [gemini] make more robust	2023-01-09 17:41:38 +08:00
oahzxl	a4ed5b0d0d	rename in doc	2023-01-09 17:41:26 +08:00
oahzxl	1bb1f2ad89	rename	2023-01-09 17:38:16 +08:00
oahzxl	cb9817f75d	rename function from index to indice	2023-01-09 17:34:30 +08:00
oahzxl	0ea903b94e	rename trace_index to trace_indice	2023-01-09 17:25:13 +08:00
Frank Lee	551cafec14	[doc] updated kernel-related optimisers' docstring (#2385 ) * [doc] updated kernel-related optimisers' docstring * polish doc	2023-01-09 17:13:53 +08:00
oahzxl	065f0b4c27	add doc for search	2023-01-09 17:11:51 +08:00
oahzxl	a68d240ed5	add doc for search chunk	2023-01-09 16:54:08 +08:00
oahzxl	1951f7fa87	code style	2023-01-09 16:30:16 +08:00
oahzxl	212b5b1b5f	add comments	2023-01-09 16:29:33 +08:00
oahzxl	19cc64b1d3	remove autochunk_available	2023-01-09 16:06:58 +08:00
eric8607242	9880fd2cd8	Fix state_dict key missing issue of the ZeroDDP (#2363 ) * Fix state_dict output for ZeroDDP duplicated parameters * Rewrite state_dict based on get_static_torch_model * Modify get_static_torch_model to be compatible with the lower version (ZeroDDP)	2023-01-09 14:35:14 +08:00
oahzxl	4d223e18a2	fix typo	2023-01-09 13:46:17 +08:00
Frank Lee	ce08661eb1	[cli] updated installation check cli for aot/jit build (#2395 )	2023-01-09 11:05:27 +08:00
jiaruifang	69d9180c4b	[hotfix] issue #2388	2023-01-07 18:23:02 +08:00
Jiarui Fang	4e96039649	[device] find best logical mesh	2023-01-07 14:04:30 +08:00
Jiarui Fang	8f72b6f8fb	[hotfix] fix implement error in diffusers	2023-01-07 07:56:39 +08:00
Frank Lee	40d376c566	[setup] support pre-build and jit-build of cuda kernels (#2374 ) * [setup] support pre-build and jit-build of cuda kernels * polish code * polish code * polish code * polish code * polish code * polish code	2023-01-06 20:50:26 +08:00
1SAA	33f3023e19	[hotfix] fix implement error in diffusers	2023-01-06 18:37:18 +08:00
Jiarui Fang	12c8bf38d7	[Pipeline] Refine GPT PP Example	2023-01-06 18:03:45 +08:00
oahzxl	8a989a0d89	code style	2023-01-06 17:55:22 +08:00
oahzxl	c3a2bf48b4	code style	2023-01-06 17:31:59 +08:00
oahzxl	a6cdbf9161	seperate trace flow	2023-01-06 17:24:23 +08:00
oahzxl	4748967fb1	ad reorder graph	2023-01-06 17:13:18 +08:00
oahzxl	da4076846d	rename	2023-01-06 17:09:37 +08:00
oahzxl	c3d72f7db9	seperate reorder	2023-01-06 16:53:01 +08:00
binmakeswell	a881d6d000	Revert "[NFC] polish code format" (#2372 )	2023-01-06 16:01:09 +08:00
Ziyue Jiang	9ae9e74017	fix diff device in some partition	2023-01-06 15:59:06 +08:00
Jiarui Fang	0dcc410f57	[NFC] polish code format	2023-01-06 15:54:06 +08:00
oahzxl	6685a9d022	seperate non chunk input	2023-01-06 15:53:24 +08:00
binmakeswell	d634eae05b	Revert "[NFC] polish code format (#2367 )" (#2371 ) This reverts commit `1f8ab6f1f5`.	2023-01-06 15:52:16 +08:00
oahzxl	f856611d21	seperate prepose_nodes	2023-01-06 15:47:17 +08:00
Shawn-Kong	d42aecdda1	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/embedding_handler.py code style (#2368 )	2023-01-06 15:47:10 +08:00
Jiarui Fang	1aaeb596c6	[example] gpt, shard init on all processes (#2366 )	2023-01-06 15:44:50 +08:00
oahzxl	f4a1607e56	seperate input node dim search	2023-01-06 15:36:17 +08:00
binmakeswell	1f8ab6f1f5	[NFC] polish code format (#2367 )	2023-01-06 15:34:48 +08:00
oahzxl	ae27a8b26d	seperate flow tracer	2023-01-06 14:57:33 +08:00
oahzxl	fd87d78a28	rename ambiguous variable	2023-01-06 14:28:04 +08:00
oahzxl	2bde9d2b7f	code format	2023-01-06 14:21:49 +08:00
oahzxl	8a634af2f5	close mem and code print	2023-01-06 14:19:45 +08:00
oahzxl	1a6d2a740b	take apart chunk code gen	2023-01-06 14:14:45 +08:00
ExtremeViscent	ac0d30fe2e	[NFC] polish batch_norm_handler.py code style (#2359 )	2023-01-06 13:41:38 +08:00
HELSON	48d33b1b17	[gemini] add get static torch model (#2356 )	2023-01-06 13:41:19 +08:00
oahzxl	efb1c64c30	restruct dir	2023-01-06 11:39:26 +08:00
ziyuhuang123	7080a8edb0	[workflow]New version: Create workflow files for examples' auto check (#2298 ) * [workflows]bug_repair * [workflow]new_pr_fixing_bugs Co-authored-by: binmakeswell <binmakeswell@gmail.com>	2023-01-06 09:26:49 +08:00
LuGY	e11a005c02	[NFC] polish colossalai/auto_parallel/tensor_shard/utils/factory.py code style (#2349 )	2023-01-05 21:17:42 +08:00
YuliangLiu0306	b5a3a4a65f	[device] find best logical mesh	2023-01-05 17:21:29 +08:00
yuxuan-lou	28e2d16794	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/graph_analysis.py code style (#2340 )	2023-01-05 16:53:24 +08:00
YuliangLiu0306	9c9246c0d9	[device] alpha beta profiler (#2311 ) * [device] alpha beta profiler * add usage * fix variable name	2023-01-05 16:39:55 +08:00
Maruyama_Aya	bd12a49e2a	[NFC] polish <colossalai/auto_parallel/tensor_shard/deprecated/constants.py> code style (#2339 )	2023-01-05 16:20:54 +08:00
Zihao	35427bcab4	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/unary_elementwise_handler.py code style (#2326 )	2023-01-05 12:18:08 +08:00
Jiarui Fang	db6eea3583	[builder] reconfig op_builder for pypi install (#2314 )	2023-01-04 16:32:32 +08:00
Junming Wu	4a79c10750	[NFC] polish colossalai/cli/benchmark/__init__.py code style (#2308 )	2023-01-04 15:09:57 +08:00
Ofey Chan	87d2defda6	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/layer_norm_handler.py code style (#2305 )	2023-01-04 15:09:57 +08:00
ver217	116e3d0b8f	[NFC] polish communication/p2p_v2.py code style (#2303 )	2023-01-04 15:09:57 +08:00
xyupeng	b965585d05	[NFC] polish colossalai/amp/torch_amp/torch_amp.py code style (#2290 )	2023-01-04 15:09:57 +08:00
Zangwei Zheng	d1e5bafcd4	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/__init__.py code style (#2291 )	2023-01-04 15:09:57 +08:00
shenggan	950685873f	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/reshape_handler.py code style (#2292 )	2023-01-04 15:09:57 +08:00
Ziheng Qin	3041014089	[NFC] polish colossalai/amp/naive_amp/grad_scaler/dynamic_grad_scaler.py code style (#2299 ) Co-authored-by: henryqin1997 <henryqin1997@gamil.com>	2023-01-04 15:09:57 +08:00
アマデウス	49715a78f0	[NFC] polish colossalai/cli/benchmark/benchmark.py code style (#2287 )	2023-01-04 15:09:57 +08:00

1266 Commits (4ee311c0262dfbca9b5da7e18f04dd8f1f23fe4c)