oahzxl
c4b15661d7
[autochunk] add benchmark for transformer and alphafold ( #2543 )
2023-02-02 15:06:43 +08:00
oahzxl
05671fcb42
[autochunk] support multi outputs chunk search ( #2538 )
...
Support multi-output chunk search. Previously only single-output chunk search was supported. The new search is more flexible and improves performance by a large margin. For transformer, memory is reduced by 40% compared with the previous search strategy.
1. rewrite search strategy to support multi outputs chunk search
2. fix many, many bugs
3. update tests
2023-02-01 13:18:51 +08:00
oahzxl
63199c6687
[autochunk] support transformer ( #2526 )
2023-01-31 16:00:06 +08:00
Frank Lee
b55deb0662
[workflow] only report coverage for changed files ( #2524 )
...
* [workflow] only report coverage for changed files
* polish file
2023-01-30 21:28:27 +08:00
HELSON
b528eea0f0
[zero] add zero wrappers ( #2523 )
...
* [zero] add zero wrappers
* change names
* add wrapper functions to init
2023-01-29 17:52:58 +08:00
HELSON
077a5cdde4
[zero] fix gradient clipping in hybrid parallelism ( #2521 )
...
* [zero] fix gradient clipping in hybrid parallelism
* [testing] change model name to avoid pytest warning
* [hotfix] fix unit testing
2023-01-29 15:09:57 +08:00
HELSON
707b11d4a0
[gemini] update ddp strict mode ( #2518 )
...
* [zero] add strict ddp mode for chunk init
* [gemini] update gpt example
2023-01-28 14:35:25 +08:00
HELSON
2d1a7dfe5f
[zero] add strict ddp mode ( #2508 )
...
* [zero] add strict ddp mode
* [polish] add comments for strict ddp mode
* [zero] fix test error
2023-01-20 14:04:38 +08:00
oahzxl
c04f183237
[autochunk] support parsing blocks ( #2506 )
2023-01-20 11:18:17 +08:00
oahzxl
72341e65f4
[auto-chunk] support extramsa ( #3 ) ( #2504 )
2023-01-20 10:13:03 +08:00
oahzxl
ecccc91f21
[autochunk] support autochunk on evoformer ( #2497 )
2023-01-19 11:41:00 +08:00
HELSON
d565a24849
[zero] add unit testings for hybrid parallelism ( #2486 )
2023-01-18 10:36:10 +08:00
oahzxl
4953b4ace1
[autochunk] support evoformer tracer ( #2485 )
...
Support the full evoformer tracer; evoformer is a main module of AlphaFold. Previously only a simplified version of it was supported.
1. support some of evoformer's ops in fx
2. support evoformer test
3. add repos for test code
2023-01-16 19:25:05 +08:00
YuliangLiu0306
67e1912b59
[autoparallel] support origin activation ckpt on autoparallel system ( #2468 )
2023-01-16 16:25:13 +08:00
HELSON
21c88220ce
[zero] add unit test for low-level zero init ( #2474 )
2023-01-15 10:42:01 +08:00
HELSON
a5dc4253c6
[zero] polish low level optimizer ( #2473 )
2023-01-13 14:56:17 +08:00
Jiarui Fang
867c8c2d3a
[zero] low level optim supports ProcessGroup ( #2464 )
2023-01-13 10:05:58 +08:00
YuliangLiu0306
8221fd7485
[autoparallel] update binary elementwise handler ( #2451 )
...
* [autoparallel] update binary elementwise handler
* polish
2023-01-12 09:35:10 +08:00
HELSON
5521af7877
[zero] fix state_dict and load_state_dict for ddp ignored parameters ( #2443 )
...
* [ddp] add is_ddp_ignored
[ddp] rename to is_ddp_ignored
* [zero] fix state_dict and load_state_dict
* fix bugs
* [zero] update unit test for ZeroDDP
2023-01-11 14:55:41 +08:00
YuliangLiu0306
41429b9b28
[autoparallel] add shard option ( #2423 )
2023-01-11 13:40:33 +08:00
HELSON
bb4e9a311a
[zero] add inference mode and its unit test ( #2418 )
2023-01-11 10:07:37 +08:00
oahzxl
61fdd3464a
update doc
2023-01-10 12:29:09 +08:00
oahzxl
36ab2cb783
change import
2023-01-10 12:20:40 +08:00
oahzxl
7ab2db206f
adapt new fx
2023-01-10 11:56:00 +08:00
oahzxl
e532679c95
Merge branch 'main' of https://github.com/oahzxl/ColossalAI into chunk
2023-01-10 11:29:01 +08:00
oahzxl
c1492e5013
add test in import
2023-01-10 11:20:28 +08:00
HELSON
ea13a201bb
[polish] polish code for get_static_torch_model ( #2405 )
...
* [gemini] polish code
* [testing] remove code
* [gemini] make more robust
2023-01-09 17:41:38 +08:00
oahzxl
212b5b1b5f
add comments
2023-01-09 16:29:33 +08:00
oahzxl
aafc3516a5
add available
2023-01-09 15:32:19 +08:00
oahzxl
d5c4f0bf95
code style
2023-01-09 15:22:09 +08:00
oahzxl
d106b271f8
add chunk search test
2023-01-09 15:19:08 +08:00
oahzxl
a005965d2d
update codegen test
2023-01-09 14:57:47 +08:00
oahzxl
3abbaf8bc6
update codegen test
2023-01-09 14:53:04 +08:00
oahzxl
74b81395a2
update codegen test
2023-01-09 14:26:22 +08:00
oahzxl
18a51c87fe
rename test
2023-01-09 14:20:54 +08:00
oahzxl
cb68ee864a
set benchmark
2023-01-09 14:20:41 +08:00
Jiarui Fang
4e96039649
[device] find best logical mesh
2023-01-07 14:04:30 +08:00
Frank Lee
40d376c566
[setup] support pre-build and jit-build of cuda kernels ( #2374 )
...
* [setup] support pre-build and jit-build of cuda kernels
* polish code
2023-01-06 20:50:26 +08:00
oahzxl
a6cdbf9161
separate trace flow
2023-01-06 17:24:23 +08:00
oahzxl
da4076846d
rename
2023-01-06 17:09:37 +08:00
oahzxl
fd87d78a28
rename ambiguous variable
2023-01-06 14:28:04 +08:00
oahzxl
8a634af2f5
close mem and code print
2023-01-06 14:19:45 +08:00
oahzxl
1a6d2a740b
take apart chunk code gen
2023-01-06 14:14:45 +08:00
HELSON
48d33b1b17
[gemini] add get static torch model ( #2356 )
2023-01-06 13:41:19 +08:00
oahzxl
d1f0773182
rename
2023-01-06 11:48:33 +08:00
oahzxl
06a5355d98
update test
2023-01-06 11:44:01 +08:00
oahzxl
efb1c64c30
restructure dir
2023-01-06 11:39:26 +08:00
YuliangLiu0306
b5a3a4a65f
[device] find best logical mesh
2023-01-05 17:21:29 +08:00
YuliangLiu0306
9c9246c0d9
[device] alpha beta profiler ( #2311 )
...
* [device] alpha beta profiler
* add usage
* fix variable name
2023-01-05 16:39:55 +08:00
Jiarui Fang
db6eea3583
[builder] reconfig op_builder for pypi install ( #2314 )
2023-01-04 16:32:32 +08:00