eric8607242
9880fd2cd8
Fix state_dict key missing issue of the ZeroDDP ( #2363 )
...
* Fix state_dict output for ZeroDDP duplicated parameters
* Rewrite state_dict based on get_static_torch_model
* Modify get_static_torch_model to be compatible with the lower version (ZeroDDP)
2023-01-09 14:35:14 +08:00
Frank Lee
ce08661eb1
[cli] updated installation check cli for aot/jit build ( #2395 )
2023-01-09 11:05:27 +08:00
jiaruifang
69d9180c4b
[hotfix] issue #2388
2023-01-07 18:23:02 +08:00
Jiarui Fang
4e96039649
[device] find best logical mesh
2023-01-07 14:04:30 +08:00
Jiarui Fang
8f72b6f8fb
[hotfix] fix implementation error in diffusers
2023-01-07 07:56:39 +08:00
Frank Lee
40d376c566
[setup] support pre-build and jit-build of cuda kernels ( #2374 )
...
* [setup] support pre-build and jit-build of cuda kernels
* polish code
* polish code
* polish code
* polish code
* polish code
* polish code
2023-01-06 20:50:26 +08:00
1SAA
33f3023e19
[hotfix] fix implementation error in diffusers
2023-01-06 18:37:18 +08:00
Jiarui Fang
12c8bf38d7
[Pipeline] Refine GPT PP Example
2023-01-06 18:03:45 +08:00
binmakeswell
a881d6d000
Revert "[NFC] polish code format" ( #2372 )
2023-01-06 16:01:09 +08:00
Ziyue Jiang
9ae9e74017
fix diff device in some partition
2023-01-06 15:59:06 +08:00
Jiarui Fang
0dcc410f57
[NFC] polish code format
2023-01-06 15:54:06 +08:00
binmakeswell
d634eae05b
Revert "[NFC] polish code format ( #2367 )" ( #2371 )
...
This reverts commit 1f8ab6f1f5.
2023-01-06 15:52:16 +08:00
Shawn-Kong
d42aecdda1
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/embedding_handler.py code style ( #2368 )
2023-01-06 15:47:10 +08:00
Jiarui Fang
1aaeb596c6
[example] gpt, shard init on all processes ( #2366 )
2023-01-06 15:44:50 +08:00
binmakeswell
1f8ab6f1f5
[NFC] polish code format ( #2367 )
2023-01-06 15:34:48 +08:00
ExtremeViscent
ac0d30fe2e
[NFC] polish batch_norm_handler.py code style ( #2359 )
2023-01-06 13:41:38 +08:00
HELSON
48d33b1b17
[gemini] add get static torch model ( #2356 )
2023-01-06 13:41:19 +08:00
ziyuhuang123
7080a8edb0
[workflow] New version: Create workflow files for examples' auto check ( #2298 )
...
* [workflows]bug_repair
* [workflow]new_pr_fixing_bugs
Co-authored-by: binmakeswell <binmakeswell@gmail.com>
2023-01-06 09:26:49 +08:00
LuGY
e11a005c02
[NFC] polish colossalai/auto_parallel/tensor_shard/utils/factory.py code style ( #2349 )
2023-01-05 21:17:42 +08:00
YuliangLiu0306
b5a3a4a65f
[device] find best logical mesh
2023-01-05 17:21:29 +08:00
yuxuan-lou
28e2d16794
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/graph_analysis.py code style ( #2340 )
2023-01-05 16:53:24 +08:00
YuliangLiu0306
9c9246c0d9
[device] alpha beta profiler ( #2311 )
...
* [device] alpha beta profiler
* add usage
* fix variable name
2023-01-05 16:39:55 +08:00
Maruyama_Aya
bd12a49e2a
[NFC] polish <colossalai/auto_parallel/tensor_shard/deprecated/constants.py> code style ( #2339 )
2023-01-05 16:20:54 +08:00
Zihao
35427bcab4
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/unary_elementwise_handler.py code style ( #2326 )
2023-01-05 12:18:08 +08:00
Jiarui Fang
db6eea3583
[builder] reconfig op_builder for pypi install ( #2314 )
2023-01-04 16:32:32 +08:00
Junming Wu
4a79c10750
[NFC] polish colossalai/cli/benchmark/__init__.py code style ( #2308 )
2023-01-04 15:09:57 +08:00
Ofey Chan
87d2defda6
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/layer_norm_handler.py code style ( #2305 )
2023-01-04 15:09:57 +08:00
ver217
116e3d0b8f
[NFC] polish communication/p2p_v2.py code style ( #2303 )
2023-01-04 15:09:57 +08:00
xyupeng
b965585d05
[NFC] polish colossalai/amp/torch_amp/torch_amp.py code style ( #2290 )
2023-01-04 15:09:57 +08:00
Zangwei Zheng
d1e5bafcd4
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/__init__.py code style ( #2291 )
2023-01-04 15:09:57 +08:00
shenggan
950685873f
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/reshape_handler.py code style ( #2292 )
2023-01-04 15:09:57 +08:00
Ziheng Qin
3041014089
[NFC] polish colossalai/amp/naive_amp/grad_scaler/dynamic_grad_scaler.py code style ( #2299 )
...
Co-authored-by: henryqin1997 <henryqin1997@gamil.com>
2023-01-04 15:09:57 +08:00
アマデウス
49715a78f0
[NFC] polish colossalai/cli/benchmark/benchmark.py code style ( #2287 )
2023-01-04 15:09:57 +08:00
Zirui Zhu
1c29b173c9
[NFC] polish colossalai/auto_parallel/tensor_shard/node_handler/getitem_handler.py code style ( #2289 )
2023-01-04 15:09:57 +08:00
Zihao
3a02b46447
[auto-parallel] refactoring ColoTracer ( #2118 )
...
* add meta_data_computing
* add checkpoint_annotation
* rename proxy.data to proxy.meta_data and add bias addition pass
* polish code
* delete meta_prop_pass invoke and rename ori_node to orig_node
* add TracerType
* unify meta data computing
* delete TracerType
* handle setitem operation
* operator.setitem
2023-01-04 14:44:22 +08:00
HELSON
5d3a2be3af
[amp] add gradient clipping for unit tests ( #2283 )
...
* [amp] add gradient clipping in unit tests
* fix bugs
2023-01-04 11:59:56 +08:00
Boyuan Yao
d45695d94e
Merge pull request #2258 from hpcaitech/debug/ckpt-autoparallel
...
[autockpt] provide option for activation checkpoint search in SPMD solver
2023-01-04 11:37:28 +08:00
Jiarui Fang
16cc8e6aa7
[builder] MOE builder ( #2277 )
2023-01-03 20:29:39 +08:00
Boyuan Yao
b904748210
[autoparallel] bypass MetaInfo when unavailable and modify BCAST_FUNC_OP metainfo ( #2293 )
...
* [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline
* [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop
* [autoparallel] specify comm nodes' memory cost in construct chain
* [autoparallel] fix wrong runtime apply calculation
* [autoparallel] fix wrong runtime apply calculation
* [autoparallel] fix wrong runtime apply calculation
* [autoparallel] bypass metainfo when unavailable and modify BCAST_FUNC_OP
2023-01-03 20:28:01 +08:00
Super Daniel
8ea50d999e
[hotfix] pass a parameter. ( #2288 )
...
* [autockpt] make it work.
* [autockpt] linearize / merge shape-consistency nodes.
* [autockpt] considering parameter and optimizer weights.
* [hotfix] pass a parameter.
2023-01-03 18:05:06 +08:00
zbian
e94c79f15b
improved allgather & reducescatter for 3d
2023-01-03 17:46:08 +08:00
HELSON
62c38e3330
[zero] polish low level zero optimizer ( #2275 )
2023-01-03 17:22:34 +08:00
Ziyue Jiang
ac863a01d6
[example] add benchmark ( #2276 )
...
* add benchmark
* merge common func
* add total and avg tflops
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-03 17:20:59 +08:00
Boyuan Yao
22e947f982
[autoparallel] fix runtime apply memory estimation ( #2281 )
...
* [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline
* [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop
* [autoparallel] specify comm nodes' memory cost in construct chain
* [autoparallel] fix wrong runtime apply calculation
* [autoparallel] fix wrong runtime apply calculation
* [autoparallel] fix wrong runtime apply calculation
2023-01-03 17:18:07 +08:00
Super Daniel
8e8900ff3f
[autockpt] considering parameter and optimizer weights. ( #2279 )
...
* [autockpt] make it work.
* [autockpt] linearize / merge shape-consistency nodes.
* [autockpt] considering parameter and optimizer weights.
2023-01-03 16:55:49 +08:00
YuliangLiu0306
f027ef7913
[hotfix] fix fp16 optimizer bug ( #2273 )
2023-01-03 16:53:43 +08:00
YuliangLiu0306
fb87322773
[autoparallel] fix spelling error ( #2270 )
2023-01-03 16:13:00 +08:00
Jiarui Fang
af32022f74
[Gemini] fix the convert_to_torch_module bug ( #2269 )
2023-01-03 15:55:35 +08:00
Super Daniel
b0d21d0c4f
[autockpt] linearize / merge shape-consistency nodes. ( #2271 )
...
* [autockpt] make it work.
* [autockpt] linearize / merge shape-consistency nodes.
2023-01-03 14:54:22 +08:00
YuliangLiu0306
4b29112ab2
[autoparallel] gpt2 autoparallel examples ( #2267 )
...
* [autoparallel] gpt2 autoparallel examples
* polish code
* polish code
2023-01-03 14:23:33 +08:00