Jiarui Fang
|
867c8c2d3a
|
[zero] low level optim supports ProcessGroup (#2464)
|
2023-01-13 10:05:58 +08:00 |
YuliangLiu0306
|
8221fd7485
|
[autoparallel] update binary elementwise handler (#2451)
* [autoparallel] update binary elementwise handler
* polish
|
2023-01-12 09:35:10 +08:00 |
HELSON
|
5521af7877
|
[zero] fix state_dict and load_state_dict for ddp ignored parameters (#2443)
* [ddp] add is_ddp_ignored
* [ddp] rename to is_ddp_ignored
* [zero] fix state_dict and load_state_dict
* fix bugs
* [zero] update unit test for ZeroDDP
|
2023-01-11 14:55:41 +08:00 |
YuliangLiu0306
|
41429b9b28
|
[autoparallel] add shard option (#2423)
|
2023-01-11 13:40:33 +08:00 |
HELSON
|
bb4e9a311a
|
[zero] add inference mode and its unit test (#2418)
|
2023-01-11 10:07:37 +08:00 |
oahzxl
|
61fdd3464a
|
update doc
|
2023-01-10 12:29:09 +08:00 |
oahzxl
|
36ab2cb783
|
change import
|
2023-01-10 12:20:40 +08:00 |
oahzxl
|
7ab2db206f
|
adapt new fx
|
2023-01-10 11:56:00 +08:00 |
oahzxl
|
e532679c95
|
Merge branch 'main' of https://github.com/oahzxl/ColossalAI into chunk
|
2023-01-10 11:29:01 +08:00 |
oahzxl
|
c1492e5013
|
add test in import
|
2023-01-10 11:20:28 +08:00 |
HELSON
|
ea13a201bb
|
[polish] polish code for get_static_torch_model (#2405)
* [gemini] polish code
* [testing] remove code
* [gemini] make more robust
|
2023-01-09 17:41:38 +08:00 |
oahzxl
|
212b5b1b5f
|
add comments
|
2023-01-09 16:29:33 +08:00 |
oahzxl
|
aafc3516a5
|
add available
|
2023-01-09 15:32:19 +08:00 |
oahzxl
|
d5c4f0bf95
|
code style
|
2023-01-09 15:22:09 +08:00 |
oahzxl
|
d106b271f8
|
add chunk search test
|
2023-01-09 15:19:08 +08:00 |
oahzxl
|
a005965d2d
|
update codegen test
|
2023-01-09 14:57:47 +08:00 |
oahzxl
|
3abbaf8bc6
|
update codegen test
|
2023-01-09 14:53:04 +08:00 |
oahzxl
|
74b81395a2
|
update codegen test
|
2023-01-09 14:26:22 +08:00 |
oahzxl
|
18a51c87fe
|
rename test
|
2023-01-09 14:20:54 +08:00 |
oahzxl
|
cb68ee864a
|
set benchmark
|
2023-01-09 14:20:41 +08:00 |
Jiarui Fang
|
4e96039649
|
[device] find best logical mesh
|
2023-01-07 14:04:30 +08:00 |
Frank Lee
|
40d376c566
|
[setup] support pre-build and jit-build of cuda kernels (#2374)
* [setup] support pre-build and jit-build of cuda kernels
* polish code
* polish code
* polish code
* polish code
* polish code
* polish code
|
2023-01-06 20:50:26 +08:00 |
oahzxl
|
a6cdbf9161
|
separate trace flow
|
2023-01-06 17:24:23 +08:00 |
oahzxl
|
da4076846d
|
rename
|
2023-01-06 17:09:37 +08:00 |
oahzxl
|
fd87d78a28
|
rename ambiguous variable
|
2023-01-06 14:28:04 +08:00 |
oahzxl
|
8a634af2f5
|
close mem and code print
|
2023-01-06 14:19:45 +08:00 |
oahzxl
|
1a6d2a740b
|
take apart chunk code gen
|
2023-01-06 14:14:45 +08:00 |
HELSON
|
48d33b1b17
|
[gemini] add get static torch model (#2356)
|
2023-01-06 13:41:19 +08:00 |
oahzxl
|
d1f0773182
|
rename
|
2023-01-06 11:48:33 +08:00 |
oahzxl
|
06a5355d98
|
update test
|
2023-01-06 11:44:01 +08:00 |
oahzxl
|
efb1c64c30
|
restructure dir
|
2023-01-06 11:39:26 +08:00 |
YuliangLiu0306
|
b5a3a4a65f
|
[device] find best logical mesh
|
2023-01-05 17:21:29 +08:00 |
YuliangLiu0306
|
9c9246c0d9
|
[device] alpha beta profiler (#2311)
* [device] alpha beta profiler
* add usage
* fix variable name
|
2023-01-05 16:39:55 +08:00 |
Jiarui Fang
|
db6eea3583
|
[builder] reconfig op_builder for pypi install (#2314)
|
2023-01-04 16:32:32 +08:00 |
HELSON
|
5d3a2be3af
|
[amp] add gradient clipping for unit tests (#2283)
* [amp] add gradient clipping in unit tests
* fix bugs
|
2023-01-04 11:59:56 +08:00 |
zbian
|
e94c79f15b
|
improved allgather & reducescatter for 3d
|
2023-01-03 17:46:08 +08:00 |
YuliangLiu0306
|
fb87322773
|
[autoparallel] fix spelling error (#2270)
|
2023-01-03 16:13:00 +08:00 |
YuliangLiu0306
|
8897b8f753
|
[autoparallel] autoparallel initialize (#2238)
|
2022-12-31 01:02:14 +08:00 |
YuliangLiu0306
|
3b1b91eaf4
|
[autoparallel] record parameter attribute in colotracer (#2217)
* [autoparallel] record parameter attribute in colotracer
* [autoparallel] fix construct_meta_info bug
|
2022-12-28 19:29:08 +08:00 |
Boyuan Yao
|
24246f7aa5
|
[autoparallel] Attach input, buffer and output tensor to MetaInfo class (#2162)
* [fx] metainfo class for auto parallel
* [fx] add unit test for linear metainfo
* [fx] fix bwd param for linear
* [fx] modify unit test
* [fx] modify unit test
* [fx] modify import
* [fx] modify import
* [fx] modify import
* [fx] move meta profiler to auto parallel
* [fx] add conv metainfo class
* [fx] restore profiler
* [fx] restore meta profiler
* [autoparallel] modify unit test
* [fx] modify unit test
* [autoparallel] add batchnorm metainfo class
* [autoparallel] fix batchnorm unit test function declaration
* [fx] restore profiler
* [fx] add relu metainfo class
* [fx] restore profiler
* [autoparallel] modify metainfo input
* [autoparallel] add pooling metainfo
* [autoparallel] add F.linear metainfo generator
* [autoparallel] add binary elementwise metainfo
* [fx] recover profiler
* [autoparallel] fix forward memory calculation
* [autoparallel] modify constants.py
* [autoparallel] remove redundant print
* [autoparallel] add F.conv metainfo
* [autoparallel] linear fix
* [autoparallel] memory estimation for communication actions
* [autoparallel] fix docstring
* [autoparallel] fix variables name
* [autoparallel] attach tensor to metainfo class
* [autoparallel] fix dangerous try except
* [autoparallel] attach memory cost to shape consistency node
* [autoparallel] attach shape consistency node's metainfo to the node
* [autoparallel] remove todo in shape consistency memory estimation
* [autoparallel] fix the annotation
|
2022-12-28 13:37:40 +08:00 |
YuliangLiu0306
|
78509124d3
|
[autoparallel] update getitem handler (#2207)
|
2022-12-27 19:58:32 +08:00 |
YuliangLiu0306
|
4851f2d607
|
[autoparallel] update_getattr_handler (#2193)
|
2022-12-26 21:57:39 +08:00 |
YuliangLiu0306
|
f10ce01e31
|
[autoparallel] add gpt2 performance test code (#2194)
|
2022-12-26 21:56:58 +08:00 |
HELSON
|
a3100bd50d
|
[testing] add beit model for unit testings (#2196)
* [testing] add beit model
* [beit] fix bugs
* [beit] fix bugs
* [testing] fix bugs
|
2022-12-26 17:35:36 +08:00 |
HELSON
|
2458659919
|
[zero] fix error for BEiT models (#2169)
* [zero] fix error for BEiT models
* [ColoParameter] add unpack operation for tuple arguments
* fix bugs
* fix chunkv2 unit testing
* add assertion for gradient state
|
2022-12-26 15:03:54 +08:00 |
Jiarui Fang
|
355ffb386e
|
[builder] unified cpu_optim fused_optim interface (#2190)
|
2022-12-23 20:57:41 +08:00 |
Jiarui Fang
|
9587b080ba
|
[builder] use runtime builder for fused_optim (#2189)
|
2022-12-23 17:07:03 +08:00 |
Jiarui Fang
|
bc0e271e71
|
[builder] use builder() for cpu adam and fused optim in setup.py (#2187)
|
2022-12-23 16:05:13 +08:00 |
Jiarui Fang
|
d42afd30f8
|
[builder] runtime adam and fused_optim builder (#2184)
|
2022-12-23 14:14:21 +08:00 |
YuliangLiu0306
|
550f8f8905
|
[autoparallel] integrate_gpt_related_tests (#2134)
* [autoparallel] integrate_gpt_related_tests
* polish code
* polish code
* add GPT2Model into runtime test
|
2022-12-23 12:36:59 +08:00 |