digger yu
1baeb39c72
[NFC] fix typo with colossalai/auto_parallel/tensor_shard ( #3742 )
...
* fix typo applications/ and colossalai/ date 5.11
* fix typo colossalai/
2 years ago
wukong1992
b37797ed3d
[booster] support torch fsdp plugin in booster ( #3697 )
...
Co-authored-by: 纪少敏 <jishaomin@jishaomindeMBP.lan>
2 years ago
digger-yu
ad6460cf2c
[NFC] fix typo applications/ and colossalai/ ( #3735 )
2 years ago
digger-yu
b7141c36dd
[CI] fix some spelling errors ( #3707 )
...
* fix spelling error with examples/community/
* fix spelling error with tests/
* fix some spelling error with tests/ colossalai/ etc.
2 years ago
jiangmingyan
20068ba188
[booster] add tests for ddp and low level zero's checkpointio ( #3715 )
...
* [booster] update tests for booster
* [booster] update tests for booster
* [booster] update tests for booster
* [booster] update tests for booster
* [booster] update tests for booster
* [booster] update booster tutorials#3717, fix recursive check
2 years ago
Hongxin Liu
6552cbf8e1
[booster] fix no_sync method ( #3709 )
...
* [booster] fix no_sync method
* [booster] add test for ddp no_sync
* [booster] fix merge
* [booster] update unit test
* [booster] update unit test
* [booster] update unit test
2 years ago
Hongxin Liu
3bf09efe74
[booster] update prepare dataloader method for plugin ( #3706 )
...
* [booster] add prepare dataloader method for plug
* [booster] update examples and docstr
2 years ago
Hongxin Liu
f83ea813f5
[example] add train resnet/vit with booster example ( #3694 )
...
* [example] add train vit with booster example
* [example] update readme
* [example] add train resnet with booster example
* [example] enable ci
* [example] enable ci
* [example] add requirements
* [hotfix] fix analyzer init
* [example] update requirements
2 years ago
YH
2629f9717d
[tensor] Refactor handle_trans_spec in DistSpecManager
2 years ago
Hongxin Liu
d0915f54f4
[booster] refactor all dp fashion plugins ( #3684 )
...
* [booster] add dp plugin base
* [booster] inherit dp plugin base
* [booster] refactor unit tests
2 years ago
jiangmingyan
307894f74d
[booster] gemini plugin support shard checkpoint ( #3610 )
...
* gemini plugin add shard checkpoint save/load
* gemini plugin support shard checkpoint
* [API Refactoring]gemini plugin support shard checkpoint
---------
Co-authored-by: luchen <luchen@luchendeMBP.lan>
Co-authored-by: luchen <luchen@luchendeMacBook-Pro.local>
2 years ago
YH
a22407cc02
[zero] Suggests a minor change to confusing variable names in the ZeRO optimizer. ( #3173 )
...
* Fix confusing variable name in zero opt
* Apply lint
* Fix util func
* Fix minor util func
* Fix zero param optimizer name
2 years ago
Hongxin Liu
50793b35f4
[gemini] accelerate inference ( #3641 )
...
* [gemini] support don't scatter after inference
* [chat] update colossalai strategy
* [chat] fix opt benchmark
* [chat] update opt benchmark
* [gemini] optimize inference
* [test] add gemini inference test
* [chat] fix unit test ci
* [chat] fix ci
* [chat] fix ci
* [chat] skip checkpoint test
2 years ago
Hongxin Liu
4b3240cb59
[booster] add low level zero plugin ( #3594 )
...
* [booster] add low level zero plugin
* [booster] fix gemini plugin test
* [booster] fix precision
* [booster] add low level zero plugin test
* [test] fix booster plugin test oom
* [test] fix booster plugin test oom
* [test] fix googlenet and inception output trans
* [test] fix diffuser clip vision model
* [test] fix torchaudio_wav2vec2_base
* [test] fix low level zero plugin test
2 years ago
digger-yu
b9a8dff7e5
[doc] Fix typo under colossalai and doc ( #3618 )
...
* Fixed several spelling errors under colossalai
* Fix the spelling error in colossalai and docs directory
* Cautiously changed the spelling errors under the example folder
* Update runtime_preparation_pass.py
revert autograft to autograd
* Update search_chunk.py
utile to until
* Update check_installation.py
change misteach to mismatch in line 91
* Update 1D_tensor_parallel.md
revert to perceptron
* Update 2D_tensor_parallel.md
revert to perceptron in line 73
* Update 2p5D_tensor_parallel.md
revert to perceptron in line 71
* Update 3D_tensor_parallel.md
revert to perceptron in line 80
* Update README.md
revert to resnet in line 42
* Update reorder_graph.py
revert to indice in line 7
* Update p2p.py
revert to megatron in line 94
* Update initialize.py
revert to torchrun in line 198
* Update routers.py
change to detailed in line 63
* Update routers.py
change to detailed in line 146
* Update README.md
revert random number in line 402
2 years ago
Hongxin Liu
12eff9eb4c
[gemini] state dict supports fp16 ( #3590 )
...
* [gemini] save state dict support fp16
* [gemini] save state dict shard support fp16
* [gemini] fix state dict
* [gemini] fix state dict
2 years ago
Hongxin Liu
dac127d0ee
[fx] fix meta tensor registration ( #3589 )
...
* [meta] fix torch 1.13.1
* [meta] fix torch 2.0.0
* [meta] fix torch 1.13.0
* [meta] polish code
2 years ago
Hongxin Liu
f313babd11
[gemini] support save state dict in shards ( #3581 )
...
* [gemini] support state dict shard
* [gemini] add test state dict shard
* [gemini] polish docstr
* [gemini] fix merge
* [gemini] polish code
2 years ago
YH
d329c294ec
Add docstr for zero3 chunk search utils ( #3572 )
2 years ago
Hongxin Liu
173dad0562
[misc] add verbose arg for zero and op builder ( #3552 )
...
* [misc] add print verbose
* [gemini] add print verbose
* [zero] add print verbose for low level
* [misc] add print verbose for op builder
2 years ago
Hongxin Liu
4341f5e8e6
[lazyinit] fix clone and deepcopy ( #3553 )
2 years ago
Hongxin Liu
152239bbfa
[gemini] gemini supports lazy init ( #3379 )
...
* [gemini] fix nvme optimizer init
* [gemini] gemini supports lazy init
* [gemini] add init example
* [gemini] add fool model
* [zero] update gemini ddp
* [zero] update init example
* add chunk method
* add chunk method
* [lazyinit] fix lazy tensor tolist
* [gemini] fix buffer materialization
* [misc] remove useless file
* [booster] update gemini plugin
* [test] update gemini plugin test
* [test] fix gemini plugin test
* [gemini] fix import
* [gemini] fix import
* [lazyinit] use new metatensor
* [lazyinit] use new metatensor
* [lazyinit] fix __set__ method
2 years ago
jiangmingyan
366a035552
[checkpoint] Shard saved checkpoint need to be compatible with the naming format of hf checkpoint files ( #3479 )
...
* [checkpoint] support huggingface style sharded checkpoint, to be compatible with hf file naming format
* [checkpoint] support huggingface style sharded checkpoint, to be compatible with hf file naming format
* [checkpoint] Shard saved checkpoint add 'variant' field to customize filename
* [checkpoint] Shard saved checkpoint add 'variant' field to customize filename
* [checkpoint] Shard saved checkpoint add 'variant' field to customize filename
* [checkpoint] Shard saved checkpoint add 'variant' field to customize filename
---------
Co-authored-by: luchen <luchen@luchendeMacBook-Pro.local>
Co-authored-by: luchen <luchen@luchendeMBP.lan>
2 years ago
YH
bcf0cbcbe7
[doc] Add docs for clip args in zero optim ( #3504 )
2 years ago
jiangmingyan
52a933e175
[checkpoint] support huggingface style sharded checkpoint ( #3461 )
...
* [checkpoint] support huggingface style sharded checkpoint
* [checkpoint] support huggingface style sharded checkpoint
* [checkpoint] support huggingface style sharded checkpoint
* [checkpoint] support huggingface style sharded checkpoint
* [checkpoint] support huggingface style sharded checkpoint
---------
Co-authored-by: luchen <luchen@luchendeMBP.lan>
2 years ago
Frank Lee
80eba05b0a
[test] refactor tests with spawn ( #3452 )
...
* [test] added spawn decorator
* polish code
* polish code
* polish code
* polish code
* polish code
* polish code
2 years ago
Frank Lee
7d8d825681
[booster] fixed the torch ddp plugin with the new checkpoint api ( #3442 )
2 years ago
YH
8f740deb53
Fix typo ( #3448 )
2 years ago
Hakjin Lee
46c009dba4
[format] Run lint on colossalai.engine ( #3367 )
2 years ago
YuliangLiu0306
ffcdbf0f65
[autoparallel]integrate auto parallel feature with new tracer ( #3408 )
...
* [autoparallel] integrate new analyzer in module level
* unify the profiling method
* polish
* fix no codegen bug
* fix pass bug
* fix liveness test
* polish
2 years ago
ver217
573af84184
[example] update examples related to zero/gemini ( #3431 )
...
* [zero] update legacy import
* [zero] update examples
* [example] fix opt tutorial
* [example] fix opt tutorial
* [example] fix opt tutorial
* [example] fix opt tutorial
* [example] fix import
2 years ago
Frank Lee
1beb85cc25
[checkpoint] refactored the API and added safetensors support ( #3427 )
...
* [checkpoint] refactored the API and added safetensors support
* polish code
2 years ago
ver217
26b7aac0be
[zero] reorganize zero/gemini folder structure ( #3424 )
...
* [zero] refactor low-level zero folder structure
* [zero] fix legacy zero import path
* [zero] fix legacy zero import path
* [zero] remove useless import
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor legacy zero import path
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor legacy zero import path
* [zero] fix test import path
* [zero] fix test
* [zero] fix circular import
* [zero] update import
2 years ago
Frank Lee
638a07a7f9
[test] fixed gemini plugin test ( #3411 )
...
* [test] fixed gemini plugin test
* polish code
* polish code
2 years ago
ver217
5f2e34e6c9
[booster] implement Gemini plugin ( #3352 )
...
* [booster] add gemini plugin
* [booster] update docstr
* [booster] gemini plugin add coloparam convertor
* [booster] fix coloparam convertor
* [booster] fix gemini plugin device
* [booster] add gemini plugin test
* [booster] gemini plugin ignore sync bn
* [booster] skip some model
* [booster] skip some model
* [booster] modify test world size
* [booster] modify test world size
* [booster] skip test
2 years ago
HELSON
1a1d68b053
[moe] add checkpoint for moe models ( #3354 )
...
* [moe] add checkpoint for moe models
* [hotfix] fix bugs in unit test
2 years ago
YuliangLiu0306
fee2af8610
[autoparallel] adapt autoparallel with new analyzer ( #3261 )
...
* [autoparallel] adapt autoparallel with new analyzer
* fix all node handler tests
* polish
* polish
2 years ago
Ofey Chan
8706a8c66c
[NFC] polish colossalai/engine/gradient_handler/__init__.py code style ( #3329 )
2 years ago
yuxuan-lou
198a74b9fd
[NFC] polish colossalai/context/random/__init__.py code style ( #3327 )
2 years ago
YuliangLiu0306
fbd2a9e05b
[hotfix] meta_tensor_compatibility_with_torch2
2 years ago
Michelle
ad285e1656
[NFC] polish colossalai/fx/tracer/_tracer_utils.py ( #3323 )
...
* [NFC] polish colossalai/engine/schedule/_pipeline_schedule.py code style
* [NFC] polish colossalai/fx/tracer/_tracer_utils.py code style
---------
Co-authored-by: Qianran Ma <qianranm@luchentech.com>
2 years ago
Xu Kai
64350029fe
[NFC] polish colossalai/gemini/paramhooks/_param_hookmgr.py code style
2 years ago
RichardoLuo
1ce9d0c531
[NFC] polish initializer_data.py code style ( #3287 )
2 years ago
Ziheng Qin
1bed38ef37
[NFC] polish colossalai/cli/benchmark/models.py code style ( #3290 )
2 years ago
Kai Wang (Victor Kai)
964a28678f
[NFC] polish initializer_3d.py code style ( #3279 )
2 years ago
Sze-qq
94eec1c5ad
[NFC] polish colossalai/engine/gradient_accumulation/_gradient_accumulation.py code style ( #3277 )
...
Co-authored-by: siqi <siqi@siqis-MacBook-Pro.local>
2 years ago
Arsmart1
8af977f223
[NFC] polish colossalai/context/parallel_context.py code style ( #3276 )
2 years ago
Zirui Zhu
1168b50e33
[NFC] polish colossalai/engine/schedule/_pipeline_schedule_v2.py code style ( #3275 )
2 years ago
Tong Li
196d4696d0
[NFC] polish colossalai/nn/_ops/addmm.py code style ( #3274 )
2 years ago
lucasliunju
4b95464994
[NFC] polish colossalai/amp/__init__.py code style ( #3272 )
2 years ago
Xuanlei Zhao
6b3bb2c249
[NFC] polish code style ( #3273 )
2 years ago
CZYCW
4cadb25b96
[NFC] polish colossalai/fx/proxy.py code style ( #3269 )
2 years ago
Yuanchen
d58fa705b2
[NFC] polish code style ( #3268 )
...
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2 years ago
Camille Zhong
c4a226b729
[NFC] polish tensor_placement_policy.py code style ( #3265 )
2 years ago
CsRic
00778abc48
[NFC] polish colossalai/fx/passes/split_module.py code style ( #3263 )
...
Co-authored-by: csric <richcsr256@gmail.com>
2 years ago
jiangmingyan
488f37048c
[NFC] polish colossalai/global_variables.py code style ( #3259 )
...
Co-authored-by: luchen <luchen@luchendeMBP.lan>
2 years ago
LuGY
1ff7d5bfa5
[NFC] polish colossalai/engine/gradient_handler/_moe_gradient_handler.py ( #3260 )
2 years ago
dayellow
204ca2f09a
[NFC] polish colossalai/fx/profiler/experimental/profiler_module/embedding.py code style ( #3256 )
...
Co-authored-by: Minghao Huang <huangminghao@luchentech.com>
2 years ago
HELSON
02b058032d
[fx] meta registration compatibility ( #3253 )
...
* [fx] meta registration compatibility
* fix error
2 years ago
Frank Lee
73d3e4d309
[booster] implemented the torch ddp + resnet example ( #3232 )
...
* [booster] implemented the torch ddp + resnet example
* polish code
2 years ago
YH
1a229045af
Add interface for colo tensor dp size ( #3227 )
2 years ago
YuliangLiu0306
4d5d8f98a4
[API] implement device mesh manager ( #3221 )
...
* [API] implement device mesh manager
* polish
2 years ago
Frank Lee
cd142fbefa
[api] implemented the checkpoint io module ( #3205 )
...
* [api] implemented the checkpoint io module
* polish code
* polish code
2 years ago
ver217
f8289d4221
[lazyinit] combine lazy tensor with dtensor ( #3204 )
...
* [lazyinit] lazy tensor add distribute
* [lazyinit] refactor distribute
* [lazyinit] add test dist lazy init
* [lazyinit] add verbose info for dist lazy init
* [lazyinit] fix rnn flatten weight op
* [lazyinit] polish test
* [lazyinit] polish test
* [lazyinit] fix lazy tensor data setter
* [lazyinit] polish test
* [lazyinit] fix clean
* [lazyinit] make materialize inplace
* [lazyinit] refactor materialize
* [lazyinit] refactor test distribute
* [lazyinit] fix requires_grad
* [lazyinit] fix tolist after materialization
* [lazyinit] refactor distribute module
* [lazyinit] polish docstr
* [lazyinit] polish lazy init context
* [lazyinit] temporarily skip test
* [lazyinit] polish test
* [lazyinit] add docstr
2 years ago
Frank Lee
e3ad88fb48
[booster] implemented the cluster module ( #3191 )
...
* [booster] implemented the cluster module
* polish code
2 years ago
YuliangLiu0306
f57d34958b
[FX] refactor experimental tracer and adapt it with hf models ( #3157 )
...
* pass gpt trace and meta_prop
* pass t5 trace and meta_prop
* [FX] refactor experimental tracer and adapt it with hf models
* pass all mainstream model zoo
* fix CI
* fix CI
* fix CI
* fix CI
* fix CI
* fix CI
* fix CI
* fix CI
* skip tests
* fix CI
* using packaging version
* polish
2 years ago
Frank Lee
e7f3bed2d3
[booster] added the plugin base and torch ddp plugin ( #3180 )
...
* [booster] added the plugin base and torch ddp plugin
* polish code
* polish code
* polish code
2 years ago
Zihao
18dbe76cae
[auto-parallel] add auto-offload feature ( #3154 )
...
* add auto-offload feature
* polish code
* fix syn offload runtime pass bug
* add offload example
* fix offload testing bug
* fix example testing bug
2 years ago
YuliangLiu0306
258b43317c
[hotfix] layout converting issue ( #3188 )
2 years ago
YH
80aed29cd3
[zero] Refactor ZeroContextConfig class using dataclass ( #3186 )
2 years ago
YH
9d644ff09f
Fix docstr for zero statedict ( #3185 )
2 years ago
zbian
7bc0afc901
updated flash attention usage
2 years ago
Frank Lee
a9b8402d93
[booster] added the accelerator implementation ( #3159 )
2 years ago
ver217
6ae8ed0407
[lazyinit] add correctness verification ( #3147 )
...
* [lazyinit] fix shared module
* [tests] add lazy init test utils
* [tests] add torchvision for lazy init
* [lazyinit] fix pre op fn
* [lazyinit] handle legacy constructor
* [tests] refactor lazy init test models
* [tests] refactor lazy init test utils
* [lazyinit] fix ops don't support meta
* [tests] lazy init test timm models
* [lazyinit] fix set data
* [lazyinit] handle apex layers
* [tests] lazy init test transformers models
* [tests] lazy init test torchaudio models
* [lazyinit] fix import path
* [tests] lazy init test torchrec models
* [tests] update torch version in CI
* [tests] revert torch version in CI
* [tests] skip lazy init test
2 years ago
Frank Lee
ed19290560
[booster] implemented mixed precision class ( #3151 )
...
* [booster] implemented mixed precision class
* polish code
2 years ago
YuliangLiu0306
2eca4cd376
[DTensor] refactor dtensor with new components ( #3089 )
...
* [DTensor] refactor dtensor with new components
* polish
2 years ago
ver217
ed8f60b93b
[lazyinit] refactor lazy tensor and lazy init ctx ( #3131 )
...
* [lazyinit] refactor lazy tensor and lazy init ctx
* [lazyinit] polish docstr
* [lazyinit] polish docstr
2 years ago
Frank Lee
95a36eae63
[kernel] added kernel loader to softmax autograd function ( #3093 )
...
* [kernel] added kernel loader to softmax autograd function
* [release] v0.2.6
2 years ago
Super Daniel
fff98f06ed
[analyzer] a minimal implementation of static graph analyzer ( #2852 )
...
* [hotfix] meta tensor default device.
* [siu] add experimental submodules to main branch.
* [siu]
* [siu]
* [analyzer] init.
* [analyzer] readme.
* [analyzer] readme.
* [analyzer] readme.
* [analyzer] readme.
* [test] add test.
* Update symbolic_trace.py
* mark skip tests.
* try except.
* try except.
* try except.
* s
* init
* init
* fix
* skip
* skip
---------
Co-authored-by: Daniel Shao <superdainiu@MININT-PVARVID.fareast.corp.microsoft.com>
Co-authored-by: Daniel Shao <superdainiu@Daniels-Mac.local>
2 years ago
Xuanlei Zhao
10c61de2f7
[autochunk] support vit ( #3084 )
...
support vit for autochunk
* support some new ops for vit
* fix some bugs
* add test for vit
2 years ago
YuliangLiu0306
8e4e8601b7
[DTensor] implement layout converter ( #3055 )
...
* [DTensor] refactor LayoutConverter for DTensor
* polish code
* polish docstring
2 years ago
Frank Lee
f19b49e164
[booster] init module structure and definition ( #3056 )
2 years ago
Xuanlei Zhao
2ca9728cbb
[autochunk] refactor chunk memory estimation ( #2762 )
...
* refactor memory code
* don't log free var memory
* add memory align
* update chunk target
* update setting for new memory
* finish test
* update tracer
* update typo
* update test
2 years ago
YuliangLiu0306
29386a54e6
[DTensor] refactor CommSpec ( #3034 )
2 years ago
YuliangLiu0306
cd2b0eaa8d
[DTensor] refactor sharding spec ( #2987 )
...
* [autoparallel] refactor sharding spec
* rename function name
2 years ago
Ziyue Jiang
400f63012e
[pipeline] Add Simplified Alpa DP Partition ( #2507 )
...
* add alpa dp split
* add alpa dp split
* use fwd+bwd instead of fwd only
---------
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
Super Daniel
b42d3d28ed
[fx] remove deprecated algorithms. ( #2312 ) ( #2313 )
2 years ago
github-actions[bot]
82503a96f2
[format] applied code formatting on changed files in pull request 2997 ( #3008 )
...
Co-authored-by: github-actions <github-actions@github.com>
2 years ago
binmakeswell
52a5078988
[doc] add ISC tutorial ( #2997 )
...
* [doc] add ISC tutorial
* [doc] add ISC tutorial
* [doc] add ISC tutorial
* [doc] add ISC tutorial
2 years ago
ver217
823f3b9cf4
[doc] add deepspeed citation and copyright ( #2996 )
...
* [doc] add deepspeed citation and copyright
* [doc] add deepspeed citation and copyright
* [doc] add deepspeed citation and copyright
2 years ago
YuliangLiu0306
e414e4092b
[DTensor] implementation of dtensor ( #2946 )
...
* [DTensor] implementation of dtensor
* test layout convert
* polish
2 years ago
YuliangLiu0306
47fb214b3b
[hotfix] add shard dim to avoid backward communication error ( #2954 )
2 years ago
ver217
090f14fd6b
[misc] add reference ( #2930 )
...
* [misc] add reference
* [misc] add license
2 years ago
YuliangLiu0306
197d0bf4ed
[autoparallel] apply repeat block to reduce solving time ( #2912 )
2 years ago
YH
a848091141
Fix port exception type ( #2925 )
2 years ago
zbian
61e687831d
fixed using zero with tp cannot access weight correctly
2 years ago
YH
7b13f7db18
[zero] trivial zero optimizer refactoring ( #2869 )
...
* Fix minor grad store interface
* Apply lint
2 years ago
Jiatong (Julius) Han
8c8a39be95
[hotfix]: Remove math.prod dependency ( #2837 )
...
* Remove math.prod dependency
* Fix style
* Fix style
---------
Co-authored-by: Jiatong Han <jiatong.han@u.nus.edu>
2 years ago
YuliangLiu0306
819e25d8b1
[hotfix] fix autoparallel compatibility test issues ( #2754 )
2 years ago
YuliangLiu0306
0f392d7403
[autoparallel] find repeat blocks ( #2854 )
...
* [autoparallel] find repeat blocks
* polish
* polish
* polish
2 years ago
junxu
c52edcf0eb
Rename class method of ZeroDDP ( #2692 )
2 years ago
HELSON
6e4ac08172
[hotfix] fix chunk size can not be divided ( #2867 )
...
* [hotfix] fix chunk size can not be divided
* [hotfix] use numpy for python3.8
2 years ago
Boyuan Yao
eae77c831d
[autoparallel] Patch meta information for nodes that will not be handled by SPMD solver ( #2823 )
...
* [autoparallel] non spmd meta information generator
* [autoparallel] patch meta information for non spmd nodes
2 years ago
Boyuan Yao
c7764d3f22
[autoparallel] Patch meta information of `torch.where` ( #2822 )
...
* [autoparallel] patch meta information of torch.where
* [autoparallel] pre-commit modified
2 years ago
Boyuan Yao
fcc4097efa
[autoparallel] Patch meta information of `torch.tanh()` and `torch.nn.Dropout` ( #2773 )
...
* [autoparallel] tanh meta information
* [autoparallel] remove redundant code
* [autoparallel] patch meta information of torch.nn.Dropout
2 years ago
Frank Lee
935346430f
[cli] handled version check exceptions ( #2848 )
...
* [cli] handled version check exceptions
* polish code
2 years ago
Frank Lee
918bc94b6b
[triton] added copyright information for flash attention ( #2835 )
...
* [triton] added copyright information for flash attention
* polish code
2 years ago
Boyuan Yao
7ea6bc7f69
[autoparallel] Patch tensor related operations meta information ( #2789 )
...
* [autoparallel] tensor related meta information prototype
* [autoparallel] tensor related meta information
* [autoparallel] tensor related meta information
* [autoparallel] tensor related meta information
* [autoparallel] tensor related meta information
2 years ago
Michelle
c008d4ad0c
[NFC] polish colossalai/engine/schedule/_pipeline_schedule.py code style ( #2744 )
2 years ago
YuliangLiu0306
2059fdd6b0
[hotfix] add copyright for solver and device mesh ( #2803 )
...
* [hotfix] add copyright for solver and device mesh
* add readme
* add alpa license
* polish
2 years ago
Boyuan Yao
8593ae1a3f
[autoparallel] rotor solver refactor ( #2813 )
...
* [autoparallel] rotor solver refactor
* [autoparallel] rotor solver refactor
2 years ago
HELSON
56ddc9ca7a
[hotfix] add correct device for fake_param ( #2796 )
2 years ago
Boyuan Yao
a2b43e393d
[autoparallel] Patch meta information of `torch.nn.Embedding` ( #2760 )
...
* [autoparallel] embedding metainfo
* [autoparallel] fix function name in test_activation_metainfo
* [autoparallel] undo changes in activation metainfo and related tests
2 years ago
Boyuan Yao
8e3f66a0d1
[zero] fix wrong import ( #2777 )
2 years ago
Nikita Shulga
01066152f1
Don't use `torch._six` ( #2775 )
...
* Don't use `torch._six`
This is a private API which is gone after https://github.com/pytorch/pytorch/pull/94709
* Update common.py
2 years ago
binmakeswell
93b788b95a
Merge branch 'main' into fix/format
2 years ago
xyupeng
2fd528b9f4
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/graph_analysis.py code style ( #2737 )
2 years ago
YuliangLiu0306
1dc003c169
[autoparallel] distinguish different parallel strategies ( #2699 )
2 years ago
YH
ae86a29e23
Refactor method of grad store ( #2687 )
2 years ago
Zirui Zhu
c9e3ee389e
[NFC] polish colossalai/context/process_group_initializer/initializer_2d.py code style ( #2726 )
2 years ago
Zangwei Zheng
1819373e5c
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/batch_norm_handler.py code style ( #2728 )
2 years ago
Wangbo Zhao(黑色枷锁)
8331420520
[NFC] polish colossalai/cli/cli.py code style ( #2734 )
2 years ago
ziyuhuang123
d344313533
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/embedding_handler.py code style ( #2725 )
2 years ago
Xue Fuzhao
e81caeb4bc
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/cost_graph.py code style ( #2720 )
...
Co-authored-by: Fuzhao Xue <fuzhao@login2.ls6.tacc.utexas.edu>
2 years ago
yuxuan-lou
51c45c2460
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/where_handler.py code style ( #2723 )
2 years ago
YuliangLiu0306
21d6a48f4d
[autoparallel] add shard option ( #2696 )
...
* [autoparallel] add shard option
* polish
2 years ago
YuliangLiu0306
5b24987fa7
[autoparallel] fix parameters sharding bug ( #2716 )
2 years ago
Ziyue Jiang
4603538ddd
[NFC] polish colossalai/context/process_group_initializer/initializer_sequence.py code style ( #2712 )
...
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
YuliangLiu0306
cb2c6a2415
[autoparallel] refactor runtime pass ( #2644 )
...
* [autoparallel] refactor runtime pass
* add unit test
* polish
2 years ago
Zihao
b3d10db5f1
[NFC] polish colossalai/cli/launcher/__init__.py code style ( #2709 )
2 years ago
YuliangLiu0306
0b2a738393
[autoparallel] remove deprecated codes ( #2664 )
2 years ago
YuliangLiu0306
7fa6be49d2
[autoparallel] test compatibility for gemini and auto parallel ( #2700 )
2 years ago
CZYCW
4ac8bfb072
[NFC] polish colossalai/engine/gradient_handler/utils.py code style ( #2708 )
2 years ago
Liu Ziming
6427c406cf
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/strategy_generator.py code style ( #2695 )
...
Co-authored-by: shenggan <csg19971016@gmail.com>
2 years ago
アマデウス
534f68c83c
[NFC] polish pipeline process group code style ( #2694 )
2 years ago
LuGY
56ff1921e9
[NFC] polish colossalai/context/moe_context.py code style ( #2693 )
2 years ago
Shawn-Kong
1712da2800
[NFC] polish colossalai/gemini/gemini_context.py code style ( #2690 )
2 years ago
HELSON
df4f020ee3
[zero1&2] only append parameters with gradients ( #2681 )
2 years ago
ver217
f0aa191f51
[gemini] fix colo_init_context ( #2683 )
2 years ago
Boyuan Yao
40c916b192
[autoparallel] Patch meta information of `torch.nn.functional.softmax` and `torch.nn.Softmax` ( #2674 )
...
* [autoparallel] softmax metainfo
* [autoparallel] softmax metainfo
2 years ago
HELSON
8213f89fd2
[gemini] add fake_release_chunk for keep-gathered chunk in the inference mode ( #2671 )
2 years ago
binmakeswell
9ab14b20b5
[doc] add CVPR tutorial ( #2666 )
2 years ago
Boyuan Yao
0385b26ebf
[autoparallel] Patch meta information of `torch.nn.LayerNorm` ( #2647 )
...
* [autoparallel] layernorm metainfo patch
* [autoparallel] polish test
2 years ago
YuliangLiu0306
37df666f38
[autoparallel] refactor handlers which reshape input tensors ( #2615 )
...
* [autoparallel] refactor handlers which reshape input tensors
* polish
2 years ago
YuliangLiu0306
28398f1c70
add overlap option ( #2613 )
2 years ago
YuliangLiu0306
cb3d1bef62
[autoparallel] adapt autoparallel tests with latest api ( #2626 )
2 years ago
Boyuan Yao
90a9fdd91d
[autoparallel] Patch meta information of `torch.matmul` ( #2584 )
...
* [autoparallel] matmul metainfo
* [auto_parallel] remove unused print
* [tests] skip test_matmul_handler when torch version is lower than 1.12.0
2 years ago
oahzxl
6ba8364881
[autochunk] support diffusion for autochunk ( #2621 )
...
* add alphafold benchmark
* rename alphafold test
* rename tests
* rename diffuser
* rename
* rename
* update transformer
* update benchmark
* update benchmark
* update bench memory
* update transformer benchmark
* rename
* support diffuser
* support unet metainfo prop
* fix bug and simplify code
* update linear and support some op
* optimize max region search, support conv
* update unet test
* support some op
* support groupnorm and interpolate
* update flow search
* add fix dim in node flow
* fix utils
* rename
* support diffusion
* update diffuser
* update chunk search
* optimize imports
* import
* finish autochunk
2 years ago
Frank Lee
8518263b80
[test] fixed the triton version for testing ( #2608 )
2 years ago
HELSON
552183bb74
[polish] polish ColoTensor and its submodules ( #2537 )
2 years ago
Frank Lee
dd14783f75
[kernel] fixed repeated loading of kernels ( #2549 )
...
* [kernel] fixed repeated loading of kernels
* polish code
* polish code
2 years ago
ver217
5b1854309a
[hotfix] fix zero ddp warmup check ( #2545 )
2 years ago
oahzxl
fa3d66feb9
support unet metainfo prop ( #2544 )
2 years ago
oahzxl
05671fcb42
[autochunk] support multi outputs chunk search ( #2538 )
...
Support multi-output chunk search. Previously only single-output chunk search was supported. The new search is more flexible and improves performance by a large margin; for transformers, it reduces memory by 40% compared with the previous search strategy.
1. rewrite search strategy to support multi-output chunk search
2. fix many, many bugs
3. update tests
2 years ago
oahzxl
63199c6687
[autochunk] support transformer ( #2526 )
2 years ago
HELSON
a4ed9125ac
[hotfix] fix lightning error ( #2529 )
2 years ago
HELSON
66dfcf5281
[gemini] update the gpt example ( #2527 )
2 years ago
HELSON
b528eea0f0
[zero] add zero wrappers ( #2523 )
...
* [zero] add zero wrappers
* change names
* add wrapper functions to init
2 years ago
Super Daniel
c198c7c0b0
[hotfix] meta tensor default device. ( #2510 )
2 years ago
HELSON
077a5cdde4
[zero] fix gradient clipping in hybrid parallelism ( #2521 )
...
* [zero] fix gradient clipping in hybrid parallelism
* [testing] change model name to avoid pytest warning
* [hotfix] fix unit testing
2 years ago
YuliangLiu0306
aa0f6686f9
[autoparallel] accelerate gpt2 training ( #2495 )
2 years ago
HELSON
707b11d4a0
[gemini] update ddp strict mode ( #2518 )
...
* [zero] add strict ddp mode for chunk init
* [gemini] update gpt example
2 years ago
HELSON
2d1a7dfe5f
[zero] add strict ddp mode ( #2508 )
...
* [zero] add strict ddp mode
* [polish] add comments for strict ddp mode
* [zero] fix test error
2 years ago
oahzxl
c04f183237
[autochunk] support parsing blocks ( #2506 )
2 years ago
Super Daniel
35c0c0006e
[utils] lazy init. ( #2148 )
...
* [utils] lazy init.
* [utils] remove description.
* [utils] complete.
* [utils] finalize.
* [utils] fix names.
2 years ago
oahzxl
72341e65f4
[auto-chunk] support extramsa ( #3 ) ( #2504 )
2 years ago
Ziyue Jiang
0f02b8c6e6
add avg partition ( #2483 )
...
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
アマデウス
99d9713b02
Revert "Update parallel_context.py ( #2408 )"
...
This reverts commit 7d5640b9db.
2 years ago
oahzxl
ecccc91f21
[autochunk] support autochunk on evoformer ( #2497 )
2 years ago
oahzxl
5db3a5bf42
[fx] allow control of ckpt_codegen init ( #2498 )
...
* [fx] allow control of ckpt_codegen init
Currently in ColoGraphModule, ActivationCheckpointCodeGen is set automatically in __init__, which prevents any other codegen from being set.
So this adds an arg to control whether ActivationCheckpointCodeGen is set in __init__.
* code style
2 years ago
HELSON
d565a24849
[zero] add unit testings for hybrid parallelism ( #2486 )
2 years ago
oahzxl
4953b4ace1
[autochunk] support evoformer tracer ( #2485 )
...
Support the full evoformer tracer; evoformer is a main module of alphafold. Previously only a simplified version of it was supported.
1. support some evoformer's op in fx
2. support evoformer test
3. add repos for test code
2 years ago
YuliangLiu0306
67e1912b59
[autoparallel] support origin activation ckpt on autoparallel system ( #2468 )
2 years ago
Ziyue Jiang
fef5c949c3
polish pp middleware ( #2476 )
...
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
HELSON
a5dc4253c6
[zero] polish low level optimizer ( #2473 )
2 years ago
Frank Lee
8b7495dd54
[example] integrate seq-parallel tutorial with CI ( #2463 )
2 years ago
Jiarui Fang
867c8c2d3a
[zero] low level optim supports ProcessGroup ( #2464 )
2 years ago
Frank Lee
14d9299360
[cli] fixed hostname mismatch error ( #2465 )
2 years ago
Haofan Wang
9358262992
Fix False warning in initialize.py ( #2456 )
...
* Update initialize.py
* pre-commit run check
2 years ago
YuliangLiu0306
8221fd7485
[autoparallel] update binary elementwise handler ( #2451 )
...
* [autoparallel] update binary elementwise handler
* polish
2 years ago
HELSON
2bfeb24308
[zero] add warning for ignored parameters ( #2446 )
2 years ago
Frank Lee
39163417a1
[example] updated the hybrid parallel tutorial ( #2444 )
...
* [example] updated the hybrid parallel tutorial
* polish code
2 years ago
HELSON
5521af7877
[zero] fix state_dict and load_state_dict for ddp ignored parameters ( #2443 )
...
* [ddp] add is_ddp_ignored
[ddp] rename to is_ddp_ignored
* [zero] fix state_dict and load_state_dict
* fix bugs
* [zero] update unit test for ZeroDDP
2 years ago
YuliangLiu0306
2731531bc2
[autoparallel] integrate device mesh initialization into autoparallelize ( #2393 )
...
* [autoparallel] integrate device mesh initialization into autoparallelize
* add megatron solution
* update gpt autoparallel examples with latest api
* adapt beta value to fit the current computation cost
2 years ago
Frank Lee
c72c827e95
[cli] provided more details if colossalai run fail ( #2442 )
2 years ago
Super Daniel
c41e59e5ad
[fx] allow native ckpt trace and codegen. ( #2438 )
2 years ago
YuliangLiu0306
41429b9b28
[autoparallel] add shard option ( #2423 )
2 years ago
HELSON
7829aa094e
[ddp] add is_ddp_ignored ( #2434 )
...
[ddp] rename to is_ddp_ignored
2 years ago
HELSON
bb4e9a311a
[zero] add inference mode and its unit test ( #2418 )
2 years ago
Jiarui Fang
93f62dd152
[autochunk] add autochunk feature
2 years ago
HELSON
dddacd2d2c
[hotfix] add norm clearing for the overflow step ( #2416 )
2 years ago
oahzxl
7ab2db206f
adapt new fx
2 years ago
oahzxl
e532679c95
Merge branch 'main' of https://github.com/oahzxl/ColossalAI into chunk
2 years ago
Haofan Wang
7d5640b9db
Update parallel_context.py ( #2408 )
2 years ago
oahzxl
fd818cf144
change imports
2 years ago
oahzxl
a591d45b29
add available
2 years ago
oahzxl
615e7e68d9
update doc
2 years ago
oahzxl
7d4abaa525
add doc
2 years ago
oahzxl
1be0ac3cbf
add doc for trace indice
2 years ago
oahzxl
0b6af554df
remove useless function
2 years ago
oahzxl
d914a21d64
rename
2 years ago
oahzxl
865f2e0196
rename
2 years ago
HELSON
ea13a201bb
[polish] polish code for get_static_torch_model ( #2405 )
...
* [gemini] polish code
* [testing] remove code
* [gemini] make more robust
2 years ago
oahzxl
a4ed5b0d0d
rename in doc
2 years ago
oahzxl
1bb1f2ad89
rename
2 years ago
oahzxl
cb9817f75d
rename function from index to indice
2 years ago
oahzxl
0ea903b94e
rename trace_index to trace_indice
2 years ago
Frank Lee
551cafec14
[doc] updated kernel-related optimisers' docstring ( #2385 )
...
* [doc] updated kernel-related optimisers' docstring
* polish doc
2 years ago
oahzxl
065f0b4c27
add doc for search
2 years ago
oahzxl
a68d240ed5
add doc for search chunk
2 years ago
oahzxl
1951f7fa87
code style
2 years ago
oahzxl
212b5b1b5f
add comments
2 years ago
oahzxl
19cc64b1d3
remove autochunk_available
2 years ago
eric8607242
9880fd2cd8
Fix state_dict key missing issue of the ZeroDDP ( #2363 )
...
* Fix state_dict output for ZeroDDP duplicated parameters
* Rewrite state_dict based on get_static_torch_model
* Modify get_static_torch_model to be compatible with the lower version (ZeroDDP)
2 years ago
oahzxl
4d223e18a2
fix typo
2 years ago
Frank Lee
ce08661eb1
[cli] updated installation check cli for aot/jit build ( #2395 )
2 years ago
jiaruifang
69d9180c4b
[hotfix] issue #2388
2 years ago
Jiarui Fang
4e96039649
[device] find best logical mesh
2 years ago
Jiarui Fang
8f72b6f8fb
[hotfix] fix implement error in diffusers
2 years ago
Frank Lee
40d376c566
[setup] support pre-build and jit-build of cuda kernels ( #2374 )
...
* [setup] support pre-build and jit-build of cuda kernels
* polish code
* polish code
* polish code
* polish code
* polish code
* polish code
2 years ago
1SAA
33f3023e19
[hotfix] fix implement error in diffusers
2 years ago
Jiarui Fang
12c8bf38d7
[Pipeline] Refine GPT PP Example
2 years ago
oahzxl
8a989a0d89
code style
2 years ago
oahzxl
c3a2bf48b4
code style
2 years ago
oahzxl
a6cdbf9161
separate trace flow
2 years ago
oahzxl
4748967fb1
add reorder graph
2 years ago
oahzxl
da4076846d
rename
2 years ago
oahzxl
c3d72f7db9
separate reorder
2 years ago
binmakeswell
a881d6d000
Revert "[NFC] polish code format" ( #2372 )
2 years ago
Ziyue Jiang
9ae9e74017
fix diff device in some partition
2 years ago
Jiarui Fang
0dcc410f57
[NFC] polish code format
2 years ago
oahzxl
6685a9d022
separate non chunk input
2 years ago
binmakeswell
d634eae05b
Revert "[NFC] polish code format ( #2367 )" ( #2371 )
...
This reverts commit 1f8ab6f1f5.
2 years ago
oahzxl
f856611d21
separate prepose_nodes
2 years ago
Shawn-Kong
d42aecdda1
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/embedding_handler.py code style ( #2368 )
2 years ago
Jiarui Fang
1aaeb596c6
[example] gpt, shard init on all processes ( #2366 )
2 years ago
oahzxl
f4a1607e56
separate input node dim search
2 years ago
binmakeswell
1f8ab6f1f5
[NFC] polish code format ( #2367 )
2 years ago
oahzxl
ae27a8b26d
separate flow tracer
2 years ago
oahzxl
fd87d78a28
rename ambiguous variable
2 years ago
oahzxl
2bde9d2b7f
code format
2 years ago
oahzxl
8a634af2f5
close mem and code print
2 years ago
oahzxl
1a6d2a740b
take apart chunk code gen
2 years ago
ExtremeViscent
ac0d30fe2e
[NFC] polish batch_norm_handler.py code style ( #2359 )
2 years ago
HELSON
48d33b1b17
[gemini] add get static torch model ( #2356 )
2 years ago
oahzxl
efb1c64c30
restruct dir
2 years ago
ziyuhuang123
7080a8edb0
[workflow]New version: Create workflow files for examples' auto check ( #2298 )
...
* [workflows]bug_repair
* [workflow]new_pr_fixing_bugs
Co-authored-by: binmakeswell <binmakeswell@gmail.com>
2 years ago
LuGY
e11a005c02
[NFC] polish colossalai/auto_parallel/tensor_shard/utils/factory.py code style ( #2349 )
2 years ago
YuliangLiu0306
b5a3a4a65f
[device] find best logical mesh
2 years ago
yuxuan-lou
28e2d16794
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/graph_analysis.py code style ( #2340 )
2 years ago
YuliangLiu0306
9c9246c0d9
[device] alpha beta profiler ( #2311 )
...
* [device] alpha beta profiler
* add usage
* fix variable name
2 years ago
Maruyama_Aya
bd12a49e2a
[NFC] polish <colossalai/auto_parallel/tensor_shard/deprecated/constants.py> code style ( #2339 )
2 years ago
Zihao
35427bcab4
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/unary_elementwise_handler.py code style ( #2326 )
2 years ago
Jiarui Fang
db6eea3583
[builder] reconfig op_builder for pypi install ( #2314 )
2 years ago
Junming Wu
4a79c10750
[NFC] polish colossalai/cli/benchmark/__init__.py code style ( #2308 )
2 years ago
Ofey Chan
87d2defda6
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/layer_norm_handler.py code style ( #2305 )
2 years ago
ver217
116e3d0b8f
[NFC] polish communication/p2p_v2.py code style ( #2303 )
2 years ago
xyupeng
b965585d05
[NFC] polish colossalai/amp/torch_amp/torch_amp.py code style ( #2290 )
2 years ago
Zangwei Zheng
d1e5bafcd4
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/__init__.py code style ( #2291 )
2 years ago
shenggan
950685873f
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/reshape_handler.py code style ( #2292 )
2 years ago
Ziheng Qin
3041014089
[NFC] polish colossalai/amp/naive_amp/grad_scaler/dynamic_grad_scaler.py code style ( #2299 )
...
Co-authored-by: henryqin1997 <henryqin1997@gamil.com>
2 years ago
アマデウス
49715a78f0
[NFC] polish colossalai/cli/benchmark/benchmark.py code style ( #2287 )
2 years ago
Zirui Zhu
1c29b173c9
[NFC] polish colossalai/auto_parallel/tensor_shard/node_handler/getitem_handler.py code style ( #2289 )
2 years ago
Zihao
3a02b46447
[auto-parallel] refactoring ColoTracer ( #2118 )
...
* add meta_data_computing
* add checkpoint_annotation
* rename proxy.data to proxy.meta_data and add bias addition pass
* polish code
* delete meta_prop_pass invoke and rename ori_node to orig_node
* add TracerType
* unify meta data computing
* delete TracerType
* handle setitem operation
* operator.setitem
2 years ago
HELSON
5d3a2be3af
[amp] add gradient clipping for unit tests ( #2283 )
...
* [amp] add gradient clipping in unit tests
* fix bugs
2 years ago
Boyuan Yao
d45695d94e
Merge pull request #2258 from hpcaitech/debug/ckpt-autoparallel
...
[autockpt] provide option for activation checkpoint search in SPMD solver
2 years ago
Jiarui Fang
16cc8e6aa7
[builder] MOE builder ( #2277 )
2 years ago
Boyuan Yao
b904748210
[autoparallel] bypass MetaInfo when unavailable and modify BCAST_FUNC_OP metainfo ( #2293 )
...
* [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline
* [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop
* [autoparallel] specify comm nodes' memory cost in construct chain
* [autoparallel] fix wrong runtime apply calculation
* [autoparallel] fix wrong runtime apply calculation
* [autoparallel] fix wrong runtime apply calculation
* [autoparallel] bypass metainfo when unavailable and modify BCAST_FUNC_OP
2 years ago
Super Daniel
8ea50d999e
[hotfix] pass a parameter. ( #2288 )
...
* [autockpt] make it work.
* [autockpt] linearize / merge shape-consistency nodes.
* [autockpt] considering parameter and optimizer weights.
* [hotfix] pass a parameter.
2 years ago
zbian
e94c79f15b
improved allgather & reducescatter for 3d
2 years ago
HELSON
62c38e3330
[zero] polish low level zero optimizer ( #2275 )
2 years ago
Ziyue Jiang
ac863a01d6
[example] add benchmark ( #2276 )
...
* add benchmark
* merge common func
* add total and avg tflops
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
Boyuan Yao
22e947f982
[autoparallel] fix runtime apply memory estimation ( #2281 )
...
* [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline
* [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop
* [autoparallel] specify comm nodes' memory cost in construct chain
* [autoparallel] fix wrong runtime apply calculation
* [autoparallel] fix wrong runtime apply calculation
* [autoparallel] fix wrong runtime apply calculation
2 years ago
Super Daniel
8e8900ff3f
[autockpt] considering parameter and optimizer weights. ( #2279 )
...
* [autockpt] make it work.
* [autockpt] linearize / merge shape-consistency nodes.
* [autockpt] considering parameter and optimizer weights.
2 years ago
YuliangLiu0306
f027ef7913
[hotfix] fix fp16 optimizer bug ( #2273 )
2 years ago
YuliangLiu0306
fb87322773
[autoparallel] fix spelling error ( #2270 )
2 years ago
Jiarui Fang
af32022f74
[Gemini] fix the convert_to_torch_module bug ( #2269 )
2 years ago
Super Daniel
b0d21d0c4f
[autockpt] linearize / merge shape-consistency nodes. ( #2271 )
...
* [autockpt] make it work.
* [autockpt] linearize / merge shape-consistency nodes.
2 years ago
YuliangLiu0306
4b29112ab2
[autoparallel] gpt2 autoparallel examples ( #2267 )
...
* [autoparallel] gpt2 autoparallel examples
* polish code
* polish code
2 years ago
Ziyue Jiang
8b045b3c1f
[Pipeline Middleware] Reduce comm redundancy by getting accurate output ( #2232 )
...
* move to cpu to avoid dead lock
* get output by offsets
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
Boyuan Yao
5c2ef9fc76
[autoparallel] modify comm nodes' memory cost in construct chain ( #2263 )
...
* [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline
* [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop
* [autoparallel] specify comm nodes' memory cost in construct chain
2 years ago
Boyuan Yao
1ea99b869e
[autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline ( #2261 )
2 years ago
Super Daniel
3ccf58aa76
[autockpt] make it work. ( #2257 )
2 years ago
Boyuan Yao
ac3739930d
[autoparallel] modify construct chain in rotor solver ( #2254 )
2 years ago
Boyuan Yao
ab38aebace
[autoparallel] Hook all meta information on ResNet nodes for auto activation checkpoint ( #2248 )
...
* [autoparallel] hook node meta on graph nodes for checkpoint solver
* [autoparallel] polish code
* [autoparallel] restore some node handlers
* colossalai/auto_parallel/passes/meta_info_prop.py
* [autoparallel] remove some unused import
* [autoparallel] hook bwd_mem_out
2 years ago
Boyuan Yao
c8c79102f0
[autoparallel] patch torch.flatten metainfo for autoparallel ( #2247 )
...
* [autoparallel] patch torch.flatten
2 years ago
YuliangLiu0306
8897b8f753
[autoparallel] autoparallel initialize ( #2238 )
2 years ago
xcnick
85178a397a
[hotfix] fix error for torch 2.0 ( #2243 )
2 years ago
Super Daniel
b7d0990c61
[autoparallel] fix construct meta info. ( #2245 )
2 years ago
Ziyue Jiang
57929a6210
fix type of num_worker_threads ( #2237 )
...
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
Jiarui Fang
db4cbdc7fb
[builder] builder for scaled_upper_triang_masked_softmax ( #2234 )
2 years ago
Super Daniel
78483a9fdd
[logger] hotfix, missing _FORMAT ( #2231 )
2 years ago
Jiarui Fang
54de05da5d
[builder] polish builder with better base class ( #2216 )
...
* [builder] polish builder
* remove print
2 years ago
YuliangLiu0306
3b1b91eaf4
[autoparallel] record parameter attribute in colotracer ( #2217 )
...
* [autoparallel] record parameter attribute in colotracer
* [autoparallel] fix construct_meta_info bug
2 years ago
Jiarui Fang
7675792100
[builder] raise Error when CUDA_HOME is not set ( #2213 )
2 years ago
Jiarui Fang
d5e3e3ec01
[example] update gpt example for larger model scale ( #2211 )
2 years ago
Boyuan Yao
24246f7aa5
[autoparallel] Attach input, buffer and output tensor to MetaInfo class ( #2162 )
...
* [fx] metainfo class for auto parallel
* [fx] add unit test for linear metainfo
* [fx] fix bwd param for linear
* [fx] modify unit test
* [fx] modify unit test
* [fx] modify import
* [fx] modify import
* [fx] modify import
* [fx] move meta profiler to auto parallel
* [fx] add conv metainfo class
* [fx] restore profiler
* [fx] restore meta profiler
* [autoparallel] modify unit test
* [fx] modify unit test
* [autoparallel] add batchnorm metainfo class
* [autoparallel] fix batchnorm unit test function declaration
* [fx] restore profiler
* [fx] add relu metainfo class
* [fx] restore profiler
* [autoparallel] modify metainfo input
* [autoparallel] add pooling metainfo
* [autoparallel] add F.linear metainfo generator
* [autoparallel] add binary elementwise metainfo
* [fx] recover profiler
* [autoparallel] fix forward memory calculation
* [autoparallel] modify constants.py
* [autoparallel] remove redundant print
* [autoparallel] add F.conv metainfo
* [autoparallel] linear fix
* [autoparallel] memory estimation for communication actions
* [autoparallel] fix docstring
* [autoparallel] fix variables name
* [autoparallel] attach tensor to metainfo class
* [autoparallel] fix dangerous try except
* [autoparallel] attach memory cost to shape consistency node
* [autoparallel] attach shape consistency node's metainfo to the node
* [autoparallel] remove todo in shape consistency memory estimation
* [autoparallel] fix the annotation
2 years ago
Boyuan Yao
d0bc5a1b34
[autoparallel] new metainfoprop based on metainfo class ( #2179 )
...
* [autoparallel] new metainfoprop to combine SPMD solver and checkpoint solver
* [autoparallel] new metainfoprop to combine SPMD solver and checkpoint solver
* [autoparallel] modify placeholder handler
* [autoparallel] modify metainfoprop
* [autoparallel] fix function typo
* [autoparallel] fix placeholder handler
2 years ago
YuliangLiu0306
78509124d3
[autoparallel] update getitem handler ( #2207 )
2 years ago
Jiarui Fang
1cb532ffec
[builder] multihead attn runtime building ( #2203 )
...
* [hotfix] correct cpu_optim runtime compilation
* [builder] multihead attn
* fix bug
* fix a bug
2 years ago
Tongping Liu
8e22c38b89
[hotfix] Fixing the bug related to ipv6 support
...
Co-authored-by: ByteDance <tongping.liu@bytedance.com>
2 years ago
YuliangLiu0306
4851f2d607
[autoparallel] update_getattr_handler ( #2193 )
2 years ago
Jiarui Fang
5682e6d346
[hotfix] correct cpu_optim runtime compilation ( #2197 )
2 years ago
HELSON
2458659919
[zero] fix error for BEiT models ( #2169 )
...
* [zero] fix error for BEiT models
* [ColoParameter] add unpack operation for tuple arguments
* fix bugs
* fix chunkv2 unit testing
* add assertion for gradient state
2 years ago
Jiarui Fang
355ffb386e
[builder] unified cpu_optim fused_optim interface ( #2190 )
2 years ago
Jiarui Fang
9587b080ba
[builder] use runtime builder for fused_optim ( #2189 )
2 years ago
Jiarui Fang
bc0e271e71
[builder] use builder() for cpu adam and fused optim in setup.py ( #2187 )
2 years ago
Jiarui Fang
d42afd30f8
[builder] runtime adam and fused_optim builder ( #2184 )
2 years ago
YuliangLiu0306
550f8f8905
[autoparallel] integrate_gpt_related_tests ( #2134 )
...
* [autoparallel] integrate_gpt_related_tests
* polish code
* polish code
* add GPT2Model into runtime test
2 years ago
Ziyue Jiang
59e343328d
[Pipeline Middleware ] Fix deadlock when num_microbatch=num_stage ( #2156 )
...
* add splitter
* polish code
* remove comment
* fix async nan by moving to cpu first
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
Tongping Liu
ab54fed292
[hotfix] add kwargs for colo_addmm ( #2171 )
2 years ago
アマデウス
622f863291
[hotfix] Jit type hint #2161 ( #2164 )
2 years ago
Zihao
12e7bcd720
register meta func for rnn ( #2159 )
2 years ago
Boyuan Yao
cfe2a9bd90
[autoparallel] memory estimation for shape consistency ( #2144 )
...
* [fx] metainfo class for auto parallel
* [fx] add unit test for linear metainfo
* [fx] fix bwd param for linear
* [fx] modify unit test
* [fx] modify unit test
* [fx] modify import
* [fx] modify import
* [fx] modify import
* [fx] move meta profiler to auto parallel
* [fx] add conv metainfo class
* [fx] restore profiler
* [fx] restore meta profiler
* [autoparallel] modify unit test
* [fx] modify unit test
* [autoparallel] add batchnorm metainfo class
* [autoparallel] fix batchnorm unit test function declaration
* [fx] restore profiler
* [fx] add relu metainfo class
* [fx] restore profiler
* [autoparallel] modify metainfo input
* [autoparallel] add pooling metainfo
* [autoparallel] add F.linear metainfo generator
* [autoparallel] add binary elementwise metainfo
* [fx] recover profiler
* [autoparallel] fix forward memory calculation
* [autoparallel] modify constants.py
* [autoparallel] remove redundant print
* [autoparallel] add F.conv metainfo
* [autoparallel] linear fix
* [autoparallel] memory estimation for communication actions
* [autoparallel] fix docstring
* [autoparallel] fix variables name
2 years ago
Jiarui Fang
b87496a66b
[hotfix] fix auto policy of test_sharded_optim_v2 ( #2157 )
2 years ago
YuliangLiu0306
16335cb537
[hotfix] fix aten default bug ( #2158 )
2 years ago
HELSON
a7d95b7024
[example] add zero1, zero2 example in GPT examples ( #2146 )
...
* [example] add zero1 and zero2 for GPT
* update readme in gpt example
* polish code
* change init value
* update readme
2 years ago
YuliangLiu0306
1cce6e36ca
[autoparallel] use metainfo in handler ( #2149 )
2 years ago
Jiarui Fang
2827f41898
[Gemini] GeminiDPP convert to PyTorch Module. ( #2151 )
2 years ago
Jiarui Fang
bdef9dfdbe
[NFC] remove useless graph node code ( #2150 )
2 years ago
BlueRum
b3f73ce1c8
[Gemini] Update coloinit_ctx to support meta_tensor ( #2147 )
2 years ago
Zihao
a128eec9d5
register aten._convolution.default ( #2137 )
2 years ago
Jiarui Fang
ee287620f0
[Gemini] revert ZeROInitCtx related tracer ( #2138 )
2 years ago
アマデウス
077a66dd81
updated attention kernel ( #2133 )
2 years ago
YuliangLiu0306
a3c6924deb
[autoparallel] process size nodes in runtime pass ( #2130 )
...
* [autoparallel] process size nodes in runtime pass
* polish code
2 years ago
YuliangLiu0306
536560ccc0
[autoparallel] implement softmax handler ( #2132 )
2 years ago
Jiarui Fang
c89c66a858
[Gemini] update API of the chunkmemstatscollector. ( #2129 )
2 years ago
Jiarui Fang
2938edf446
[Gemini] update the non model data record method in runtime memory tracer ( #2128 )
2 years ago
Jiarui Fang
8fac837679
[Gemini] update non model data calculation method ( #2126 )
2 years ago
Jiarui Fang
5efda69735
[Gemini] hotfix the unittest bugs ( #2125 )
2 years ago
Jiarui Fang
05bb28aacf
[Gemini] mapping of preop timestep and param ( #2124 )
2 years ago
YuliangLiu0306
cd0af9f7f6
[autoparallel] gpt2lp runtime test ( #2113 )
2 years ago
Jiarui Fang
9214d1fe28
[Gemini] chunk init using runtime visited param order ( #2115 )
2 years ago
HELSON
e7d3afc9cc
[optimizer] add div_scale for optimizers ( #2117 )
...
* [optimizer] add div_scale for optimizers
* [zero] use div_scale in zero optimizer
* fix testing error
2 years ago
Jiarui Fang
e5aa8333e4
[NFC] update chunk manager API ( #2119 )
2 years ago
Jiarui Fang
e99edfcb51
[NFC] polish comments for Chunk class ( #2116 )
2 years ago
Ziyue Jiang
09d69e1c25
[PP Middleware] Add bwd and step for PP middleware ( #2111 )
...
* add bwd and step for PP middleware
* pre-commit
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
Jiarui Fang
8afc001f4f
[Gemini] chunk init use OrderedParamGenerator ( #2110 )
2 years ago
HELSON
63fbba3c19
[zero] add L2 gradient clipping for ZeRO ( #2112 )
...
* [zero] add L2 gradient clipping
* [testing] add MlpModel
* [zero] add unit test for grad clipping
* fix atol
2 years ago
Jiarui Fang
70a8556946
[gemini] get the param visited order during runtime ( #2108 )
2 years ago
Jiarui Fang
61f31c3cf0
[Gemini] NFC, polish search_chunk_configuration ( #2107 )
2 years ago
Jiarui Fang
8e14344ec9
[hotfix] fix a type in ColoInitContext ( #2106 )
2 years ago
Jiarui Fang
05545bfee9
[ColoTensor] throw error when ColoInitContext meets meta parameter. ( #2105 )
2 years ago
YuliangLiu0306
d87baa85d9
[autoparallel] support linear function bias addition ( #2104 )
2 years ago
YuliangLiu0306
0fecbb9e20
[autoparallel] support addbmm computation ( #2102 )
2 years ago
YuliangLiu0306
d3d4630495
[autoparallel] add sum handler ( #2101 )
2 years ago
Ziyue Jiang
e4705ba4e2
[Pipeline Middleware] fix data race in Pipeline Scheduler for DAG ( #2087 )
...
* add DAG test case
* fix data race by adjusting the position of lock
* polish code
* fix pytest for middleware
* remove test
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
YuliangLiu0306
b175e6d58e
[autoparallel] add bias addition function class ( #2098 )
...
* [autoparallel] add bias addition function class
* polish code
* polish
2 years ago
YuliangLiu0306
3af7e65dea
[autoparallel] complete gpt related module search ( #2097 )
2 years ago
Jiarui Fang
85efb7ac2e
[Gemini] gemini use the runtime memory tracer (RMT) ( #2099 )
2 years ago
Super Daniel
2bf2d1cd3b
[fx] An experimental version of ColoTracer.' ( #2002 )
...
* [fx] add a symbolic_trace api.
* [fx] fix import errors.
* [fx] ColoTracer experimental.
2 years ago
Jiarui Fang
4b055351b0
[Gemini] make RuntimeMemTracer work correctly ( #2096 )
2 years ago
YuliangLiu0306
7f72eb0510
[autoparallel]add embedding handler ( #2089 )
...
* [autoparallel] add embedding handler
* fix bugs
2 years ago
Jiarui Fang
1fca5d79ea
[Gemini] remove GLOBAL_MODEL_DATA_TRACER ( #2091 )
2 years ago
Jiarui Fang
28e55c2530
[Gemini] remove GLOBAL_CUDA_MEM_INFO ( #2090 )
2 years ago
Jiarui Fang
25abae6d7f
[Gemini] use MemStats in Runtime Memory tracer ( #2088 )
2 years ago
Jiarui Fang
33f4412102
[Gemini] use MemStats to store the tracing data. Separate it from Collector. ( #2084 )
2 years ago
Jiarui Fang
1f99205827
[Gemini] remove static tracer ( #2083 )
2 years ago
YuliangLiu0306
0e9db368ef
[autoparallel] add tensor constructor handler ( #2082 )
2 years ago
YuliangLiu0306
cdf537a648
[autoparallel] add non_split linear strategy ( #2078 )
...
* [autoparallel] add non_split linear strategy
* polish
2 years ago
Boyuan Yao
cf0268da93
[autoparallel] Add F.conv metainfo ( #2069 )
...
* [fx] metainfo class for auto parallel
* [fx] add unit test for linear metainfo
* [fx] fix bwd param for linear
* [fx] modify unit test
* [fx] modify unit test
* [fx] modify import
* [fx] modify import
* [fx] modify import
* [fx] move meta profiler to auto parallel
* [fx] add conv metainfo class
* [fx] restore profiler
* [fx] restore meta profiler
* [autoparallel] modify unit test
* [fx] modify unit test
* [autoparallel] add batchnorm metainfo class
* [autoparallel] fix batchnorm unit test function declaration
* [fx] restore profiler
* [fx] add relu metainfo class
* [fx] restore profiler
* [autoparallel] modify metainfo input
* [autoparallel] add pooling metainfo
* [autoparallel] add F.linear metainfo generator
* [autoparallel] add binary elementwise metainfo
* [fx] recover profiler
* [autoparallel] fix forward memory calculation
* [autoparallel] modify constants.py
* [autoparallel] remove redundant print
* [autoparallel] add F.conv metainfo
* [autoparallel] linear fix
2 years ago
YuliangLiu0306
f123476666
[autoparallel] complete gpt block searching ( #2065 )
...
* [autoparallel] complete gpt block searching
* fix test
2 years ago
Ziyue Jiang
597cdd3006
[Pipeline Middleware] Adapt scheduler for Topo ( #2066 )
...
* adapt scheduler for Topo
* remove comment
* fix set input
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
Jiarui Fang
b3b89865e2
[Gemini] ParamOpHook -> ColoParamOpHook ( #2080 )
2 years ago
YuliangLiu0306
677e1e20d4
[device] update flatten device mesh usage ( #2079 )
2 years ago
Jiarui Fang
a7adad9ccb
[Gemini] rename hooks related to runtime mem tracer ( #2076 )
2 years ago
Jiarui Fang
223332ff7e
[Gemini] rename ParamTracerWrapper -> RuntimeMemTracer ( #2073 )
2 years ago
Jiarui Fang
9f828ef36f
[Gemini] remove not used MemtracerWrapper ( #2072 )
2 years ago
Boyuan Yao
616da17fab
[autoparallel] add binary elementwise metainfo for auto parallel ( #2058 )
...
* [fx] metainfo class for auto parallel
* [fx] add unit test for linear metainfo
* [fx] fix bwd param for linear
* [fx] modify unit test
* [fx] modify unit test
* [fx] modify import
* [fx] modify import
* [fx] modify import
* [fx] move meta profiler to auto parallel
* [fx] add conv metainfo class
* [fx] restore profiler
* [fx] restore meta profiler
* [autoparallel] modify unit test
* [fx] modify unit test
* [autoparallel] add batchnorm metainfo class
* [autoparallel] fix batchnorm unit test function declaration
* [fx] restore profiler
* [fx] add relu metainfo class
* [fx] restore profiler
* [autoparallel] modify metainfo input
* [autoparallel] add pooling metainfo
* [autoparallel] add F.linear metainfo generator
* [autoparallel] add binary elementwise metainfo
* [fx] recover profiler
* [autoparallel] fix forward memory calculation
* [autoparallel] modify constants.py
* [autoparallel] remove redundant print
2 years ago
Boyuan Yao
4b40fbd743
[autoparallel] fix forward memory calculation ( #2062 )
2 years ago
Ziyue Jiang
44ea461890
[Pipeline] Add Topo Class ( #2059 )
...
* use Topo class to rewrite DAG
* polish code
* polish code
* polish code
* add comment
* add else to unended if
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
YuliangLiu0306
e4293e5077
[hotfix] update test for latest version ( #2060 )
2 years ago
Zihao
38ea4ba1bd
[Gemini] fix grad unreleased issue and param recovery issue ( #2052 )
2 years ago
YuliangLiu0306
1c1fe44305
[autoparallel] adapt solver with self attention ( #2037 )
...
* [autoparallel] adapt solver with self attention
* polish code
2 years ago
Frank Lee
ea74a3b9cc
[cli] updated installation check with more information ( #2050 )
...
* [cli] updated installation check with more information
* polish code
* polish code
2 years ago
HELSON
f6178728a0
[gemini] fix init bugs for modules ( #2047 )
...
* [gemini] fix init bugs for modules
* fix bugs
2 years ago
Frank Lee
81e0da7fa8
[setup] supported conda-installed torch ( #2048 )
...
* [setup] supported conda-installed torch
* polish code
2 years ago
HELSON
e37f3db40c
[gemini] add arguments ( #2046 )
...
* [zero] fix testing parameters
* [gemini] add arguments
* add docstrings
2 years ago
Zihao
6a9158f1fa
[Gemini] free and allocate cuda memory by tensor.storage, add grad hook ( #2040 )
2 years ago
Jiarui Fang
31c644027b
[hotfix] hotfix Gemini for no leaf modules bug ( #2043 )
2 years ago
HELSON
a1ce02d740
[zero] test gradient accumulation ( #1964 )
...
* [zero] fix memory leak for zero2
* [zero] test gradient accumulation
* [zero] remove grad clip test
2 years ago
Ziyue Jiang
b0936e4a44
[rpc] split with dag ( #2028 )
...
* add DAG to split_module
* add comment
* add test case for DAG
* remove print
* add DAG middleware in scheduler
* add test case for scheduler
* remove break
* recover old lifecycle
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
Jiarui Fang
96134e7be3
[hotfix] add bert test for gemini fwd bwd ( #2035 )
2 years ago
YuliangLiu0306
0dbcd4a6f5
[autoparallel] add split handler ( #2032 )
...
* [autoparallel] add split handler
* add numerical test and runtime passes
2 years ago
Jiarui Fang
28aa9a4294
[Gemini] more rigorous unit tests for run_fwd_bwd ( #2034 )
2 years ago
YuliangLiu0306
81330b0352
[autoparallel] add experimental permute handler ( #2029 )
2 years ago
Zihao
95c4532fff
[Gemini] paramWrapper paramTracerHook unittest ( #2030 )
2 years ago
Jiarui Fang
8daf1b4db1
[Gemini] patch for supporting torch.add_ function for ColoTensor ( #2003 )
2 years ago
Ziyue Jiang
632753abbc
[fx]Split partition with DAG information ( #2025 )
...
* add DAG to split_module
* add comment
* add test case for DAG
* remove print
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
YuliangLiu0306
ea0f6b8df9
[autoparallel] add runtime pass and numerical test for view handler ( #2018 )
2 years ago
Zihao
a719b89a41
[gemini] param_trace_hook ( #2020 )
2 years ago
Jiarui Fang
0b0d8f9e17
[hotfix] revert bug PRs ( #2016 )
2 years ago
Zihao
aba3db464d
[Gemini] ParamMemHook ( #2008 )
2 years ago
Zihao
0160a62a3c
[Gemini] param_tracer_wrapper and test case ( #2009 )
2 years ago
YuliangLiu0306
1438993113
[autoparallel] add experimental view handler ( #2011 )
...
* [autoparallel] add experimental view handler
* polish
* polish
* polish code
* rename variables
2 years ago
Genghan Zhang
d655eea515
[autoparallel] mix gather ( #1977 )
...
* Add mix-gather
* Add comments
* Add comments
* Polish comments
* Change the global rank assumption
* Add tests
* Add two-step tests
* Fix 10 and 01
* Skip test because of the number of GPUs
2 years ago
Frank Lee
2bab6f512c
[release] release v0.1.11rc4 ( #2007 )
2 years ago
Boyuan Yao
6cd784ffee
[autoparallel] Add metainfo support for F.linear ( #1987 )
...
* [fx] metainfo class for auto parallel
* [fx] add unit test for linear metainfo
* [fx] fix bwd param for linear
* [fx] modify unit test
* [fx] modify unit test
* [fx] modify import
* [fx] modify import
* [fx] modify import
* [fx] move meta profiler to auto parallel
* [fx] add conv metainfo class
* [fx] restore profiler
* [fx] restore meta profiler
* [autoparallel] modify unit test
* [fx] modify unit test
* [autoparallel] add batchnorm metainfo class
* [autoparallel] fix batchnorm unit test function declaration
* [fx] restore profiler
* [fx] add relu metainfo class
* [fx] restore profiler
* [autoparallel] modify metainfo input
* [autoparallel] add pooling metainfo
* [autoparallel] add F.linear metainfo generator
2 years ago
Super Daniel
2edbef13cc
[fx] add more meta_registry for MetaTensor execution. ( #2000 )
...
* [sc] add examples for auto checkpoint.
* merge upstream
* [fx] add more meta_registry for MetaTensor execution.
2 years ago
Jiarui Fang
a2d3266648
[hotfix] make Gemini work for conv DNN ( #1998 )
2 years ago