67 Commits (1f8ab6f1f55f7a26d2584361646e244a8dd0f123)

Author SHA1 Message Date
binmakeswell 1f8ab6f1f5
[NFC] polish code format (#2367) 2023-01-06 15:34:48 +08:00
ziyuhuang123 7080a8edb0
[workflow] New version: Create workflow files for examples' auto check (#2298)
* [workflows]bug_repair

* [workflow]new_pr_fixing_bugs

Co-authored-by: binmakeswell <binmakeswell@gmail.com>
2023-01-06 09:26:49 +08:00
YuliangLiu0306 9c9246c0d9
[device] alpha beta profiler (#2311)
* [device] alpha beta profiler

* add usage

* fix variable name
2023-01-05 16:39:55 +08:00
Ofey Chan 87d2defda6 [NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/layer_norm_handler.py code style (#2305) 2023-01-04 15:09:57 +08:00
Zangwei Zheng d1e5bafcd4 [NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/__init__.py code style (#2291) 2023-01-04 15:09:57 +08:00
shenggan 950685873f [NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/reshape_handler.py code style (#2292) 2023-01-04 15:09:57 +08:00
Zirui Zhu 1c29b173c9 [NFC] polish colossalai/auto_parallel/tensor_shard/node_handler/getitem_handler.py code style (#2289) 2023-01-04 15:09:57 +08:00
Boyuan Yao d45695d94e
Merge pull request #2258 from hpcaitech/debug/ckpt-autoparallel
[autockpt] provide option for activation checkpoint search in SPMD solver
2023-01-04 11:37:28 +08:00
Boyuan Yao b904748210
[autoparallel] bypass MetaInfo when unavailable and modify BCAST_FUNC_OP metainfo (#2293)
* [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline

* [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop

* [autoparallel] specify comm nodes' memory cost in construct chain

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] fix wrong runtime apply calculation

* [autoparallel] bypass metainfo when unavailable and modify BCAST_FUNC_OP
2023-01-03 20:28:01 +08:00
YuliangLiu0306 fb87322773
[autoparallel] fix spelling error (#2270) 2023-01-03 16:13:00 +08:00
YuliangLiu0306 4b29112ab2
[autoparallel] gpt2 autoparallel examples (#2267)
* [autoparallel] gpt2 autoparallel examples

* polish code

* polish code
2023-01-03 14:23:33 +08:00
Super Daniel 3ccf58aa76
[autockpt] make it work. (#2257) 2023-01-02 23:37:45 +08:00
Boyuan Yao ab38aebace
[autoparallel] Hook all meta information on ResNet nodes for auto activation checkpoint (#2248)
* [autoparallel] hook node meta on graph nodes for checkpoint solver

* [autoparallel] polish code

* [autoparallel] restore some node handlers

* colossalai/auto_parallel/passes/meta_info_prop.py

* [autoparallel] remove some unused import

* [autoparallel] hook bwd_mem_out
2023-01-02 16:25:18 +08:00
YuliangLiu0306 8897b8f753
[autoparallel] autoparallel initialize (#2238) 2022-12-31 01:02:14 +08:00
Boyuan Yao d0bc5a1b34
[autoparallel] new metainfoprop based on metainfo class (#2179)
* [autoparallel] new metainfoprop to combine SPMD solver and checkpoint solver

* [autoparallel] new metainfoprop to combine SPMD solver and checkpoint solver

* [autoparallel] modify placeholder handler

* [autoparallel] modify metainfoprop

* [autoparallel] fix function typo

* [autoparallel] fix placeholder handler
2022-12-28 13:35:08 +08:00
YuliangLiu0306 78509124d3
[autoparallel] update getitem handler (#2207) 2022-12-27 19:58:32 +08:00
YuliangLiu0306 4851f2d607
[autoparallel] update_getattr_handler (#2193) 2022-12-26 21:57:39 +08:00
Boyuan Yao cfe2a9bd90
[autoparallel] memory estimation for shape consistency (#2144)
* [fx] metainfo class for auto parallel

* [fx] add unit test for linear metainfo

* [fx] fix bwd param for linear

* [fx] modify unit test

* [fx] modify unit test

* [fx] modify import

* [fx] modify import

* [fx] modify import

* [fx] move meta profiler to auto parallel

* [fx] add conv metainfo class

* [fx] restore profiler

* [fx] restore meta profiler

* [autoparallel] modify unit test

* [fx] modify unit test

* [autoparallel] add batchnorm metainfo class

* [autoparallel] fix batchnorm unit test function declaration

* [fx] restore profiler

* [fx] add relu metainfo class

* [fx] restore profiler

* [autoparallel] modify metainfo input

* [autoparallel] add pooling metainfo

* [autoparallel] add F.linear metainfo generator

* [autoparallel] add binary elementwise metainfo

* [fx] recover profiler

* [autoparallel] fix forward memory calculation

* [autoparallel] modify constants.py

* [autoparallel] remove redundant print

* [autoparallel] add F.conv metainfo

* [autoparallel] linear fix

* [autoparallel] memory estimation for communication actions

* [autoparallel] fix docstring

* [autoparallel] fix variables name
2022-12-21 10:39:37 +08:00
YuliangLiu0306 1cce6e36ca
[autoparallel] use metainfo in handler (#2149) 2022-12-20 10:31:22 +08:00
YuliangLiu0306 a3c6924deb
[autoparallel] process size nodes in runtime pass (#2130)
* [autoparallel] process size nodes in runtime pass

* polish code
2022-12-14 16:10:50 +08:00
YuliangLiu0306 536560ccc0
[autoparallel] implement softmax handler (#2132) 2022-12-14 16:09:53 +08:00
YuliangLiu0306 cd0af9f7f6
[autoparallel] gpt2lp runtime test (#2113) 2022-12-12 18:06:40 +08:00
YuliangLiu0306 d3d4630495
[autoparallel] add sum handler (#2101) 2022-12-08 17:02:54 +08:00
YuliangLiu0306 3af7e65dea
[autoparallel] complete gpt related module search (#2097) 2022-12-08 10:04:09 +08:00
YuliangLiu0306 7f72eb0510
[autoparallel] add embedding handler (#2089)
* [autoparallel] add embedding handler

* fix bugs
2022-12-07 09:41:46 +08:00
YuliangLiu0306 0e9db368ef
[autoparallel] add tensor constructor handler (#2082) 2022-12-06 10:20:10 +08:00
YuliangLiu0306 cdf537a648
[autoparallel] add non_split linear strategy (#2078)
* [autoparallel] add non_split linear strategy

* polish
2022-12-06 10:19:33 +08:00
YuliangLiu0306 f123476666
[autoparallel] complete gpt block searching (#2065)
* [autoparallel] complete gpt block searching

* fix test
2022-12-06 10:17:10 +08:00
YuliangLiu0306 1c1fe44305
[autoparallel] adapt solver with self attention (#2037)
* [autoparallel] adapt solver with self attention

* polish code
2022-12-01 17:53:15 +08:00
YuliangLiu0306 0dbcd4a6f5
[autoparallel] add split handler (#2032)
* [autoparallel] add split handler

* add numerical test and runtime passes
2022-11-29 11:03:51 +08:00
YuliangLiu0306 81330b0352
[autoparallel] add experimental permute handler (#2029) 2022-11-27 20:26:52 +08:00
YuliangLiu0306 ea0f6b8df9
[autoparallel] add runtime pass and numerical test for view handler (#2018) 2022-11-25 15:50:16 +08:00
YuliangLiu0306 1438993113
[autoparallel] add experimental view handler (#2011)
* [autoparallel] add experimental view handler

* polish

* polish

* polish code

* rename variables
2022-11-24 11:34:41 +08:00
YuliangLiu0306 155891113e
[autoparallel] use pytree map style to process data (#1989) 2022-11-21 10:44:22 +08:00
YuliangLiu0306 35e6b9ec82
[autoparallel] adapt handlers with attention block (#1990)
* [autoparallel] adapt handlers with attention block

* polish
2022-11-21 10:44:11 +08:00
YuliangLiu0306 05020e50d0
[autoparallel] support more flexible data type (#1967) 2022-11-18 17:01:06 +08:00
YuliangLiu0306 0da1d00399
[autoparallel] support distributed dataloader option (#1906)
* [autoparallel] support distributed dataloader option

* update output handler to support ddp dataloader

* polish code
2022-11-17 20:11:53 +08:00
YuliangLiu0306 fea3cb661c
[autoparallel] support addmm in tracer and solver (#1961)
* [fx] patch addmm

* [autoparallel] support addmm in tracer and solver
2022-11-16 14:59:18 +08:00
YuliangLiu0306 36c0f3ea5b
[autoparallel] remove redundant comm node (#1893) 2022-11-15 10:53:41 +08:00
YuliangLiu0306 1b494ad73c
[autoparallel] fix linear logical convert issue (#1857) 2022-11-10 17:19:22 +08:00
HELSON 72c9448920 [NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/operator_handler.py code style (#1845) 2022-11-09 12:08:47 +08:00
Sze-qq 95ac4f88ea [NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/conv_handler.py code style (#1829)
Co-authored-by: siqi <siqi@siqis-MacBook-Pro.local>
2022-11-09 12:08:47 +08:00
binmakeswell 3c3714fc2a [NFC] polish strategies_constructor.py code style (#1806) 2022-11-09 12:08:47 +08:00
YuliangLiu0306 49216d7ab1
[autoparallel] fix bugs caused by negative dim key (#1808)
* [autoparallel] fix bugs caused by negative dim key

* fix import error

* fix matmul test issue

* fix unit test issue
2022-11-08 17:03:50 +08:00
YuliangLiu0306 f6032ddb17
[autoparallel] fix bias addition module (#1800) 2022-11-08 16:21:25 +08:00
YuliangLiu0306 e34e850a4c
[autoparallel] add essential CommActions for broadcast operands (#1793) 2022-11-04 18:36:42 +08:00
Boyuan Yao 05ce3d369f
[fx] Add linear metainfo class for auto parallel (#1783)
* [fx] metainfo class for auto parallel

* [fx] add unit test for linear metainfo

* [fx] fix bwd param for linear

* [fx] modify unit test

* [fx] modify unit test

* [fx] modify import

* [fx] modify import

* [fx] modify import

* [fx] move meta profiler to auto parallel
2022-11-04 10:55:09 +08:00
YuliangLiu0306 2c4c7b3618
[autoparallel] add getattr handler (#1767)
* [autoparallel] add getattr handler

* polish code

* add extra processes for Parameters

* add unit test for param resharding cost

* add docstring and polish test
2022-11-03 12:31:33 +08:00
Frank Lee f3f19a5c47
[autoparallel] added matmul handler (#1763)
* [autoparallel] added matmul handler

* polish code
2022-11-01 15:14:53 +08:00
YuliangLiu0306 27de252334
[autoparallel] fix conv handler numerical test (#1771) 2022-11-01 10:43:44 +08:00