HELSON
2d1a7dfe5f
[zero] add strict ddp mode ( #2508 )
...
* [zero] add strict ddp mode
* [polish] add comments for strict ddp mode
* [zero] fix test error
2023-01-20 14:04:38 +08:00
oahzxl
c04f183237
[autochunk] support parsing blocks ( #2506 )
2023-01-20 11:18:17 +08:00
Super Daniel
35c0c0006e
[utils] lazy init. ( #2148 )
...
* [utils] lazy init.
* [utils] remove description.
* [utils] complete.
* [utils] finalize.
* [utils] fix names.
2023-01-20 10:49:00 +08:00
oahzxl
72341e65f4
[auto-chunk] support extramsa ( #3 ) ( #2504 )
2023-01-20 10:13:03 +08:00
Ziyue Jiang
0f02b8c6e6
add avg partition ( #2483 )
...
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-19 13:54:50 +08:00
アマデウス
99d9713b02
Revert "Update parallel_context.py ( #2408 )"
...
This reverts commit 7d5640b9db.
2023-01-19 12:27:48 +08:00
oahzxl
ecccc91f21
[autochunk] support autochunk on evoformer ( #2497 )
2023-01-19 11:41:00 +08:00
oahzxl
5db3a5bf42
[fx] allow control of ckpt_codegen init ( #2498 )
...
* [fx] allow control of ckpt_codegen init
Currently, ColoGraphModule sets ActivationCheckpointCodeGen automatically in __init__, which prevents any other codegen from being set.
This adds an argument to control whether ActivationCheckpointCodeGen is set in __init__.
* code style
2023-01-18 17:02:46 +08:00
HELSON
d565a24849
[zero] add unit tests for hybrid parallelism ( #2486 )
2023-01-18 10:36:10 +08:00
oahzxl
4953b4ace1
[autochunk] support evoformer tracer ( #2485 )
...
Support the full evoformer tracer; evoformer is a main module of AlphaFold. Previously, only a simplified version of it was supported.
1. support some evoformer ops in fx
2. support evoformer test
3. add repos for test code
2023-01-16 19:25:05 +08:00
YuliangLiu0306
67e1912b59
[autoparallel] support origin activation ckpt on autoparallel system ( #2468 )
2023-01-16 16:25:13 +08:00
Ziyue Jiang
fef5c949c3
polish pp middleware ( #2476 )
...
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-13 16:56:01 +08:00
HELSON
a5dc4253c6
[zero] polish low level optimizer ( #2473 )
2023-01-13 14:56:17 +08:00
Frank Lee
8b7495dd54
[example] integrate seq-parallel tutorial with CI ( #2463 )
2023-01-13 14:40:05 +08:00
Jiarui Fang
867c8c2d3a
[zero] low level optim supports ProcessGroup ( #2464 )
2023-01-13 10:05:58 +08:00
Frank Lee
14d9299360
[cli] fixed hostname mismatch error ( #2465 )
2023-01-12 14:52:09 +08:00
Haofan Wang
9358262992
Fix false warning in initialize.py ( #2456 )
...
* Update initialize.py
* pre-commit run check
2023-01-12 13:49:01 +08:00
YuliangLiu0306
8221fd7485
[autoparallel] update binary elementwise handler ( #2451 )
...
* [autoparallel] update binary elementwise handler
* polish
2023-01-12 09:35:10 +08:00
HELSON
2bfeb24308
[zero] add warning for ignored parameters ( #2446 )
2023-01-11 15:30:09 +08:00
Frank Lee
39163417a1
[example] updated the hybrid parallel tutorial ( #2444 )
...
* [example] updated the hybrid parallel tutorial
* polish code
2023-01-11 15:17:17 +08:00
HELSON
5521af7877
[zero] fix state_dict and load_state_dict for ddp ignored parameters ( #2443 )
...
* [ddp] add is_ddp_ignored
[ddp] rename to is_ddp_ignored
* [zero] fix state_dict and load_state_dict
* fix bugs
* [zero] update unit test for ZeroDDP
2023-01-11 14:55:41 +08:00
YuliangLiu0306
2731531bc2
[autoparallel] integrate device mesh initialization into autoparallelize ( #2393 )
...
* [autoparallel] integrate device mesh initialization into autoparallelize
* add megatron solution
* update gpt autoparallel examples with latest api
* adapt beta value to fit the current computation cost
2023-01-11 14:03:49 +08:00
Frank Lee
c72c827e95
[cli] provided more details if colossalai run fails ( #2442 )
2023-01-11 13:56:42 +08:00
Super Daniel
c41e59e5ad
[fx] allow native ckpt trace and codegen. ( #2438 )
2023-01-11 13:49:59 +08:00
YuliangLiu0306
41429b9b28
[autoparallel] add shard option ( #2423 )
2023-01-11 13:40:33 +08:00
HELSON
7829aa094e
[ddp] add is_ddp_ignored ( #2434 )
...
[ddp] rename to is_ddp_ignored
2023-01-11 12:22:45 +08:00
HELSON
bb4e9a311a
[zero] add inference mode and its unit test ( #2418 )
2023-01-11 10:07:37 +08:00
Jiarui Fang
93f62dd152
[autochunk] add autochunk feature
2023-01-10 16:04:42 +08:00
HELSON
dddacd2d2c
[hotfix] add norm clearing for the overflow step ( #2416 )
2023-01-10 15:43:06 +08:00
oahzxl
7ab2db206f
adapt new fx
2023-01-10 11:56:00 +08:00
oahzxl
e532679c95
Merge branch 'main' of https://github.com/oahzxl/ColossalAI into chunk
2023-01-10 11:29:01 +08:00
Haofan Wang
7d5640b9db
Update parallel_context.py ( #2408 )
2023-01-10 11:27:23 +08:00
oahzxl
fd818cf144
change imports
2023-01-10 11:10:45 +08:00
oahzxl
a591d45b29
add available
2023-01-10 10:56:39 +08:00
oahzxl
615e7e68d9
update doc
2023-01-10 10:44:07 +08:00
oahzxl
7d4abaa525
add doc
2023-01-10 09:59:47 +08:00
oahzxl
1be0ac3cbf
add doc for trace indice
2023-01-09 17:59:52 +08:00
oahzxl
0b6af554df
remove useless function
2023-01-09 17:46:43 +08:00
oahzxl
d914a21d64
rename
2023-01-09 17:45:36 +08:00
oahzxl
865f2e0196
rename
2023-01-09 17:42:25 +08:00
HELSON
ea13a201bb
[polish] polish code for get_static_torch_model ( #2405 )
...
* [gemini] polish code
* [testing] remove code
* [gemini] make more robust
2023-01-09 17:41:38 +08:00
oahzxl
a4ed5b0d0d
rename in doc
2023-01-09 17:41:26 +08:00
oahzxl
1bb1f2ad89
rename
2023-01-09 17:38:16 +08:00
oahzxl
cb9817f75d
rename function from index to indice
2023-01-09 17:34:30 +08:00
oahzxl
0ea903b94e
rename trace_index to trace_indice
2023-01-09 17:25:13 +08:00
Frank Lee
551cafec14
[doc] updated kernel-related optimisers' docstring ( #2385 )
...
* [doc] updated kernel-related optimisers' docstring
* polish doc
2023-01-09 17:13:53 +08:00
oahzxl
065f0b4c27
add doc for search
2023-01-09 17:11:51 +08:00
oahzxl
a68d240ed5
add doc for search chunk
2023-01-09 16:54:08 +08:00
oahzxl
1951f7fa87
code style
2023-01-09 16:30:16 +08:00
oahzxl
212b5b1b5f
add comments
2023-01-09 16:29:33 +08:00
oahzxl
19cc64b1d3
remove autochunk_available
2023-01-09 16:06:58 +08:00
eric8607242
9880fd2cd8
Fix state_dict key missing issue of the ZeroDDP ( #2363 )
...
* Fix state_dict output for ZeroDDP duplicated parameters
* Rewrite state_dict based on get_static_torch_model
* Modify get_static_torch_model to be compatible with the lower version (ZeroDDP)
2023-01-09 14:35:14 +08:00
oahzxl
4d223e18a2
fix typo
2023-01-09 13:46:17 +08:00
Frank Lee
ce08661eb1
[cli] updated installation check cli for aot/jit build ( #2395 )
2023-01-09 11:05:27 +08:00
jiaruifang
69d9180c4b
[hotfix] issue #2388
2023-01-07 18:23:02 +08:00
Jiarui Fang
4e96039649
[device] find best logical mesh
2023-01-07 14:04:30 +08:00
Jiarui Fang
8f72b6f8fb
[hotfix] fix implementation error in diffusers
2023-01-07 07:56:39 +08:00
Frank Lee
40d376c566
[setup] support pre-build and jit-build of cuda kernels ( #2374 )
...
* [setup] support pre-build and jit-build of cuda kernels
* polish code
* polish code
* polish code
* polish code
* polish code
* polish code
2023-01-06 20:50:26 +08:00
1SAA
33f3023e19
[hotfix] fix implementation error in diffusers
2023-01-06 18:37:18 +08:00
Jiarui Fang
12c8bf38d7
[Pipeline] Refine GPT PP Example
2023-01-06 18:03:45 +08:00
oahzxl
8a989a0d89
code style
2023-01-06 17:55:22 +08:00
oahzxl
c3a2bf48b4
code style
2023-01-06 17:31:59 +08:00
oahzxl
a6cdbf9161
separate trace flow
2023-01-06 17:24:23 +08:00
oahzxl
4748967fb1
add reorder graph
2023-01-06 17:13:18 +08:00
oahzxl
da4076846d
rename
2023-01-06 17:09:37 +08:00
oahzxl
c3d72f7db9
separate reorder
2023-01-06 16:53:01 +08:00
binmakeswell
a881d6d000
Revert "[NFC] polish code format" ( #2372 )
2023-01-06 16:01:09 +08:00
Ziyue Jiang
9ae9e74017
fix diff device in some partition
2023-01-06 15:59:06 +08:00
Jiarui Fang
0dcc410f57
[NFC] polish code format
2023-01-06 15:54:06 +08:00
oahzxl
6685a9d022
separate non chunk input
2023-01-06 15:53:24 +08:00
binmakeswell
d634eae05b
Revert "[NFC] polish code format ( #2367 )" ( #2371 )
...
This reverts commit 1f8ab6f1f5.
2023-01-06 15:52:16 +08:00
oahzxl
f856611d21
separate prepose_nodes
2023-01-06 15:47:17 +08:00
Shawn-Kong
d42aecdda1
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/embedding_handler.py code style ( #2368 )
2023-01-06 15:47:10 +08:00
Jiarui Fang
1aaeb596c6
[example] gpt, shard init on all processes ( #2366 )
2023-01-06 15:44:50 +08:00
oahzxl
f4a1607e56
separate input node dim search
2023-01-06 15:36:17 +08:00
binmakeswell
1f8ab6f1f5
[NFC] polish code format ( #2367 )
2023-01-06 15:34:48 +08:00
oahzxl
ae27a8b26d
separate flow tracer
2023-01-06 14:57:33 +08:00
oahzxl
fd87d78a28
rename ambiguous variable
2023-01-06 14:28:04 +08:00
oahzxl
2bde9d2b7f
code format
2023-01-06 14:21:49 +08:00
oahzxl
8a634af2f5
close mem and code print
2023-01-06 14:19:45 +08:00
oahzxl
1a6d2a740b
take apart chunk code gen
2023-01-06 14:14:45 +08:00
ExtremeViscent
ac0d30fe2e
[NFC] polish batch_norm_handler.py code style ( #2359 )
2023-01-06 13:41:38 +08:00
HELSON
48d33b1b17
[gemini] add get static torch model ( #2356 )
2023-01-06 13:41:19 +08:00
oahzxl
efb1c64c30
restructure dir
2023-01-06 11:39:26 +08:00
ziyuhuang123
7080a8edb0
[workflow]New version: Create workflow files for examples' auto check ( #2298 )
...
* [workflows]bug_repair
* [workflow]new_pr_fixing_bugs
Co-authored-by: binmakeswell <binmakeswell@gmail.com>
2023-01-06 09:26:49 +08:00
LuGY
e11a005c02
[NFC] polish colossalai/auto_parallel/tensor_shard/utils/factory.py code style ( #2349 )
2023-01-05 21:17:42 +08:00
YuliangLiu0306
b5a3a4a65f
[device] find best logical mesh
2023-01-05 17:21:29 +08:00
yuxuan-lou
28e2d16794
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/graph_analysis.py code style ( #2340 )
2023-01-05 16:53:24 +08:00
YuliangLiu0306
9c9246c0d9
[device] alpha beta profiler ( #2311 )
...
* [device] alpha beta profiler
* add usage
* fix variable name
2023-01-05 16:39:55 +08:00
Maruyama_Aya
bd12a49e2a
[NFC] polish <colossalai/auto_parallel/tensor_shard/deprecated/constants.py> code style ( #2339 )
2023-01-05 16:20:54 +08:00
Zihao
35427bcab4
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/unary_elementwise_handler.py code style ( #2326 )
2023-01-05 12:18:08 +08:00
Jiarui Fang
db6eea3583
[builder] reconfig op_builder for pypi install ( #2314 )
2023-01-04 16:32:32 +08:00
Junming Wu
4a79c10750
[NFC] polish colossalai/cli/benchmark/__init__.py code style ( #2308 )
2023-01-04 15:09:57 +08:00
Ofey Chan
87d2defda6
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/layer_norm_handler.py code style ( #2305 )
2023-01-04 15:09:57 +08:00
ver217
116e3d0b8f
[NFC] polish communication/p2p_v2.py code style ( #2303 )
2023-01-04 15:09:57 +08:00
xyupeng
b965585d05
[NFC] polish colossalai/amp/torch_amp/torch_amp.py code style ( #2290 )
2023-01-04 15:09:57 +08:00
Zangwei Zheng
d1e5bafcd4
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/__init__.py code style ( #2291 )
2023-01-04 15:09:57 +08:00
shenggan
950685873f
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/reshape_handler.py code style ( #2292 )
2023-01-04 15:09:57 +08:00
Ziheng Qin
3041014089
[NFC] polish colossalai/amp/naive_amp/grad_scaler/dynamic_grad_scaler.py code style ( #2299 )
...
Co-authored-by: henryqin1997 <henryqin1997@gamil.com>
2023-01-04 15:09:57 +08:00
アマデウス
49715a78f0
[NFC] polish colossalai/cli/benchmark/benchmark.py code style ( #2287 )
2023-01-04 15:09:57 +08:00