Frank Lee
ed19290560
[booster] implemented mixed precision class ( #3151 )
...
* [booster] implemented mixed precision class
* polish code
2023-03-17 11:00:15 +08:00
YuliangLiu0306
2eca4cd376
[DTensor] refactor dtensor with new components ( #3089 )
...
* [DTensor] refactor dtensor with new components
* polish
2023-03-14 16:25:47 +08:00
ver217
ed8f60b93b
[lazyinit] refactor lazy tensor and lazy init ctx ( #3131 )
...
* [lazyinit] refactor lazy tensor and lazy init ctx
* [lazyinit] polish docstr
* [lazyinit] polish docstr
2023-03-14 15:37:12 +08:00
Frank Lee
95a36eae63
[kernel] added kernel loader to softmax autograd function ( #3093 )
...
* [kernel] added kernel loader to softmax autograd function
* [release] v0.2.6
2023-03-10 14:27:09 +08:00
Super Daniel
fff98f06ed
[analyzer] a minimal implementation of static graph analyzer ( #2852 )
...
* [hotfix] meta tensor default device.
* [siu] add experimental submodules to main branch.
* [siu]
* [siu]
* [analyzer] init.
* [analyzer] readme.
* [analyzer] readme.
* [analyzer] readme.
* [analyzer] readme.
* [test] add test.
* Update symbolic_trace.py
* mark skip tests.
* try except.
* try except.
* try except.
* s
* init
* init
* fix
* skip
* skip
---------
Co-authored-by: Daniel Shao <superdainiu@MININT-PVARVID.fareast.corp.microsoft.com>
Co-authored-by: Daniel Shao <superdainiu@Daniels-Mac.local>
2023-03-10 13:21:05 +08:00
Xuanlei Zhao
10c61de2f7
[autochunk] support vit ( #3084 )
...
support vit for autochunk
* support some new ops for vit
* fix some bugs
* add test for vit
2023-03-10 10:23:26 +08:00
YuliangLiu0306
8e4e8601b7
[DTensor] implement layout converter ( #3055 )
...
* [DTensor] refactor LayoutConverter for DTensor
* polish code
* polish docstring
2023-03-10 09:53:52 +08:00
Frank Lee
f19b49e164
[booster] init module structure and definition ( #3056 )
2023-03-09 11:27:46 +08:00
Xuanlei Zhao
2ca9728cbb
[autochunk] refactor chunk memory estimation ( #2762 )
...
* refactor memory code
* don't log free var memory
* add memory align
* update chunk target
* update setting for new memory
* finish test
* update tracer
* update typo
* update test
2023-03-08 16:22:30 +08:00
YuliangLiu0306
29386a54e6
[DTensor] refactor CommSpec ( #3034 )
2023-03-08 10:45:31 +08:00
YuliangLiu0306
cd2b0eaa8d
[DTensor] refactor sharding spec ( #2987 )
...
* [autoparallel] refactor sharding spec
* rename function name
2023-03-07 11:08:11 +08:00
Ziyue Jiang
400f63012e
[pipeline] Add Simplified Alpa DP Partition ( #2507 )
...
* add alpa dp split
* add alpa dp split
* use fwd+bwd instead of fwd only
---------
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-03-07 10:34:31 +08:00
Super Daniel
b42d3d28ed
[fx] remove deprecated algorithms. ( #2312 ) ( #2313 )
2023-03-07 10:30:35 +08:00
github-actions[bot]
82503a96f2
[format] applied code formatting on changed files in pull request 2997 ( #3008 )
...
Co-authored-by: github-actions <github-actions@github.com>
2023-03-06 10:42:22 +08:00
binmakeswell
52a5078988
[doc] add ISC tutorial ( #2997 )
...
* [doc] add ISC tutorial
* [doc] add ISC tutorial
* [doc] add ISC tutorial
* [doc] add ISC tutorial
2023-03-06 10:36:38 +08:00
ver217
823f3b9cf4
[doc] add deepspeed citation and copyright ( #2996 )
...
* [doc] add deepspeed citation and copyright
* [doc] add deepspeed citation and copyright
* [doc] add deepspeed citation and copyright
2023-03-04 20:08:11 +08:00
YuliangLiu0306
e414e4092b
[DTensor] implementation of dtensor ( #2946 )
...
* [DTensor] implementation of dtensor
* test layout convert
* polish
2023-03-01 16:34:58 +08:00
YuliangLiu0306
47fb214b3b
[hotfix] add shard dim to avoid backward communication error ( #2954 )
2023-03-01 11:41:53 +08:00
ver217
090f14fd6b
[misc] add reference ( #2930 )
...
* [misc] add reference
* [misc] add license
2023-02-28 18:07:24 +08:00
YuliangLiu0306
197d0bf4ed
[autoparallel] apply repeat block to reduce solving time ( #2912 )
2023-02-28 11:03:30 +08:00
YH
a848091141
Fix port exception type ( #2925 )
2023-02-28 11:00:43 +08:00
zbian
61e687831d
fixed weight not being accessed correctly when using zero with tp
2023-02-28 10:52:30 +08:00
YH
7b13f7db18
[zero] trivial zero optimizer refactoring ( #2869 )
...
* Fix minor grad store interface
* Apply lint
2023-02-27 14:04:53 +08:00
Jiatong (Julius) Han
8c8a39be95
[hotfix]: Remove math.prod dependency ( #2837 )
...
* Remove math.prod dependency
* Fix style
* Fix style
---------
Co-authored-by: Jiatong Han <jiatong.han@u.nus.edu>
2023-02-23 23:56:15 +08:00
YuliangLiu0306
819e25d8b1
[hotfix] fix autoparallel compatibility test issues ( #2754 )
2023-02-23 17:28:36 +08:00
YuliangLiu0306
0f392d7403
[autoparallel] find repeat blocks ( #2854 )
...
* [autoparallel] find repeat blocks
* polish
* polish
* polish
2023-02-23 17:28:19 +08:00
junxu
c52edcf0eb
Rename class method of ZeroDDP ( #2692 )
2023-02-22 15:05:53 +08:00
HELSON
6e4ac08172
[hotfix] fix chunk size can not be divided ( #2867 )
...
* [hotfix] fix chunk size can not be divided
* [hotfix] use numpy for python3.8
2023-02-22 15:04:46 +08:00
Boyuan Yao
eae77c831d
[autoparallel] Patch meta information for nodes that will not be handled by SPMD solver ( #2823 )
...
* [autoparallel] non spmd meta information generator
* [autoparallel] patch meta information for non spmd nodes
2023-02-22 10:28:56 +08:00
Boyuan Yao
c7764d3f22
[autoparallel] Patch meta information of `torch.where` ( #2822 )
...
* [autoparallel] patch meta information of torch.where
* [autoparallel] pre-commit modified
2023-02-22 10:28:21 +08:00
Boyuan Yao
fcc4097efa
[autoparallel] Patch meta information of `torch.tanh()` and `torch.nn.Dropout` ( #2773 )
...
* [autoparallel] tanh meta information
* [autoparallel] remove redundant code
* [autoparallel] patch meta information of torch.nn.Dropout
2023-02-22 10:27:59 +08:00
Frank Lee
935346430f
[cli] handled version check exceptions ( #2848 )
...
* [cli] handled version check exceptions
* polish code
2023-02-21 17:04:49 +08:00
Frank Lee
918bc94b6b
[triton] added copyright information for flash attention ( #2835 )
...
* [triton] added copyright information for flash attention
* polish code
2023-02-21 11:25:57 +08:00
Boyuan Yao
7ea6bc7f69
[autoparallel] Patch tensor related operations meta information ( #2789 )
...
* [autoparallel] tensor related meta information prototype
* [autoparallel] tensor related meta information
* [autoparallel] tensor related meta information
* [autoparallel] tensor related meta information
* [autoparallel] tensor related meta information
2023-02-20 17:38:55 +08:00
Michelle
c008d4ad0c
[NFC] polish colossalai/engine/schedule/_pipeline_schedule.py code style ( #2744 )
2023-02-20 10:38:40 +08:00
YuliangLiu0306
2059fdd6b0
[hotfix] add copyright for solver and device mesh ( #2803 )
...
* [hotfix] add copyright for solver and device mesh
* add readme
* add alpa license
* polish
2023-02-18 21:14:38 +08:00
Boyuan Yao
8593ae1a3f
[autoparallel] rotor solver refactor ( #2813 )
...
* [autoparallel] rotor solver refactor
* [autoparallel] rotor solver refactor
2023-02-18 11:30:15 +08:00
HELSON
56ddc9ca7a
[hotfix] add correct device for fake_param ( #2796 )
2023-02-17 15:29:07 +08:00
Boyuan Yao
a2b43e393d
[autoparallel] Patch meta information of `torch.nn.Embedding` ( #2760 )
...
* [autoparallel] embedding metainfo
* [autoparallel] fix function name in test_activation_metainfo
* [autoparallel] undo changes in activation metainfo and related tests
2023-02-17 10:39:48 +08:00
Boyuan Yao
8e3f66a0d1
[zero] fix wrong import ( #2777 )
2023-02-17 10:26:07 +08:00
Nikita Shulga
01066152f1
Don't use `torch._six` ( #2775 )
...
* Don't use `torch._six`
This is a private API which is gone after https://github.com/pytorch/pytorch/pull/94709
* Update common.py
2023-02-17 09:22:45 +08:00
binmakeswell
93b788b95a
Merge branch 'main' into fix/format
2023-02-15 20:23:51 +08:00
xyupeng
2fd528b9f4
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/graph_analysis.py code style ( #2737 )
2023-02-15 22:57:45 +08:00
YuliangLiu0306
1dc003c169
[autoparallel] distinguish different parallel strategies ( #2699 )
2023-02-15 22:28:28 +08:00
YH
ae86a29e23
Refactor method of grad store ( #2687 )
2023-02-15 22:27:58 +08:00
Zirui Zhu
c9e3ee389e
[NFC] polish colossalai/context/process_group_initializer/initializer_2d.py code style ( #2726 )
2023-02-15 22:27:13 +08:00
Zangwei Zheng
1819373e5c
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/batch_norm_handler.py code style ( #2728 )
2023-02-15 22:26:13 +08:00
Wangbo Zhao(黑色枷锁)
8331420520
[NFC] polish colossalai/cli/cli.py code style ( #2734 )
2023-02-15 22:25:28 +08:00
ziyuhuang123
d344313533
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/embedding_handler.py code style ( #2725 )
2023-02-15 16:31:40 +08:00
Xue Fuzhao
e81caeb4bc
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/cost_graph.py code style ( #2720 )
...
Co-authored-by: Fuzhao Xue <fuzhao@login2.ls6.tacc.utexas.edu>
2023-02-15 16:12:45 +08:00
yuxuan-lou
51c45c2460
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/where_handler.py code style ( #2723 )
2023-02-15 16:12:24 +08:00
YuliangLiu0306
21d6a48f4d
[autoparallel] add shard option ( #2696 )
...
* [autoparallel] add shard option
* polish
2023-02-15 13:48:28 +08:00
YuliangLiu0306
5b24987fa7
[autoparallel] fix parameters sharding bug ( #2716 )
2023-02-15 12:25:50 +08:00
Ziyue Jiang
4603538ddd
[NFC] polish colossalai/context/process_group_initializer/initializer_sequence.py code style ( #2712 )
...
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-02-15 10:53:38 +08:00
YuliangLiu0306
cb2c6a2415
[autoparallel] refactor runtime pass ( #2644 )
...
* [autoparallel] refactor runtime pass
* add unit test
* polish
2023-02-15 10:36:19 +08:00
Zihao
b3d10db5f1
[NFC] polish colossalai/cli/launcher/__init__.py code style ( #2709 )
2023-02-15 09:57:22 +08:00
YuliangLiu0306
0b2a738393
[autoparallel] remove deprecated codes ( #2664 )
2023-02-15 09:54:32 +08:00
YuliangLiu0306
7fa6be49d2
[autoparallel] test compatibility for gemini and auto parallel ( #2700 )
2023-02-15 09:43:29 +08:00
CZYCW
4ac8bfb072
[NFC] polish colossalai/engine/gradient_handler/utils.py code style ( #2708 )
2023-02-15 09:40:08 +08:00
Liu Ziming
6427c406cf
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/strategy_generator.py code style ( #2695 )
...
Co-authored-by: shenggan <csg19971016@gmail.com>
2023-02-14 21:30:25 +08:00
アマデウス
534f68c83c
[NFC] polish pipeline process group code style ( #2694 )
2023-02-14 18:12:01 +08:00
LuGY
56ff1921e9
[NFC] polish colossalai/context/moe_context.py code style ( #2693 )
2023-02-14 18:02:45 +08:00
Shawn-Kong
1712da2800
[NFC] polish colossalai/gemini/gemini_context.py code style ( #2690 )
2023-02-14 11:55:23 +08:00
HELSON
df4f020ee3
[zero1&2] only append parameters with gradients ( #2681 )
2023-02-13 18:00:16 +08:00
ver217
f0aa191f51
[gemini] fix colo_init_context ( #2683 )
2023-02-13 17:53:15 +08:00
Boyuan Yao
40c916b192
[autoparallel] Patch meta information of `torch.nn.functional.softmax` and `torch.nn.Softmax` ( #2674 )
...
* [autoparallel] softmax metainfo
* [autoparallel] softmax metainfo
2023-02-13 16:09:22 +08:00
HELSON
8213f89fd2
[gemini] add fake_release_chunk for keep-gathered chunk in the inference mode ( #2671 )
2023-02-13 14:35:32 +08:00
binmakeswell
9ab14b20b5
[doc] add CVPR tutorial ( #2666 )
2023-02-10 20:43:34 +08:00
Boyuan Yao
0385b26ebf
[autoparallel] Patch meta information of `torch.nn.LayerNorm` ( #2647 )
...
* [autoparallel] layernorm metainfo patch
* [autoparallel] polish test
2023-02-10 14:29:24 +08:00
YuliangLiu0306
37df666f38
[autoparallel] refactor handlers which reshape input tensors ( #2615 )
...
* [autoparallel] refactor handlers which reshape input tensors
* polish
2023-02-08 15:02:49 +08:00
YuliangLiu0306
28398f1c70
add overlap option ( #2613 )
2023-02-08 15:02:31 +08:00
YuliangLiu0306
cb3d1bef62
[autoparallel] adapt autoparallel tests with latest api ( #2626 )
2023-02-08 15:02:12 +08:00
Boyuan Yao
90a9fdd91d
[autoparallel] Patch meta information of `torch.matmul` ( #2584 )
...
* [autoparallel] matmul metainfo
* [auto_parallel] remove unused print
* [tests] skip test_matmul_handler when torch version is lower than 1.12.0
2023-02-08 11:05:31 +08:00
oahzxl
6ba8364881
[autochunk] support diffusion for autochunk ( #2621 )
...
* add alphafold benchmark
* rename alphafold test
* rename tests
* rename diffuser
* rename
* rename
* update transformer
* update benchmark
* update benchmark
* update bench memory
* update transformer benchmark
* rename
* support diffuser
* support unet metainfo prop
* fix bug and simplify code
* update linear and support some op
* optimize max region search, support conv
* update unet test
* support some op
* support groupnorm and interpolate
* update flow search
* add fix dim in node flow
* fix utils
* rename
* support diffusion
* update diffuser
* update chunk search
* optimize imports
* import
* finish autochunk
2023-02-07 16:32:45 +08:00
Frank Lee
8518263b80
[test] fixed the triton version for testing ( #2608 )
2023-02-07 13:49:38 +08:00
HELSON
552183bb74
[polish] polish ColoTensor and its submodules ( #2537 )
2023-02-03 11:44:10 +08:00
Frank Lee
dd14783f75
[kernel] fixed repeated loading of kernels ( #2549 )
...
* [kernel] fixed repeated loading of kernels
* polish code
* polish code
2023-02-03 09:47:13 +08:00
ver217
5b1854309a
[hotfix] fix zero ddp warmup check ( #2545 )
2023-02-02 16:42:38 +08:00
oahzxl
fa3d66feb9
support unet metainfo prop ( #2544 )
2023-02-02 16:19:26 +08:00
oahzxl
05671fcb42
[autochunk] support multi outputs chunk search ( #2538 )
...
Support multi-output chunk search. Previously only single-output chunk search was supported. The new search is more flexible and improves performance by a large margin. For transformer, it reduces memory by 40% compared with the previous search strategy.
1. rewrite search strategy to support multi outputs chunk search
2. fix many, many bugs
3. update tests
2023-02-01 13:18:51 +08:00
oahzxl
63199c6687
[autochunk] support transformer ( #2526 )
2023-01-31 16:00:06 +08:00
HELSON
a4ed9125ac
[hotfix] fix lightning error ( #2529 )
2023-01-31 10:40:39 +08:00
HELSON
66dfcf5281
[gemini] update the gpt example ( #2527 )
2023-01-30 17:58:05 +08:00
HELSON
b528eea0f0
[zero] add zero wrappers ( #2523 )
...
* [zero] add zero wrappers
* change names
* add wrapper functions to init
2023-01-29 17:52:58 +08:00
Super Daniel
c198c7c0b0
[hotfix] meta tensor default device. ( #2510 )
2023-01-29 16:28:10 +08:00
HELSON
077a5cdde4
[zero] fix gradient clipping in hybrid parallelism ( #2521 )
...
* [zero] fix gradient clipping in hybrid parallelism
* [testing] change model name to avoid pytest warning
* [hotfix] fix unit testing
2023-01-29 15:09:57 +08:00
YuliangLiu0306
aa0f6686f9
[autoparallel] accelerate gpt2 training ( #2495 )
2023-01-29 11:13:15 +08:00
HELSON
707b11d4a0
[gemini] update ddp strict mode ( #2518 )
...
* [zero] add strict ddp mode for chunk init
* [gemini] update gpt example
2023-01-28 14:35:25 +08:00
HELSON
2d1a7dfe5f
[zero] add strict ddp mode ( #2508 )
...
* [zero] add strict ddp mode
* [polish] add comments for strict ddp mode
* [zero] fix test error
2023-01-20 14:04:38 +08:00
oahzxl
c04f183237
[autochunk] support parsing blocks ( #2506 )
2023-01-20 11:18:17 +08:00
Super Daniel
35c0c0006e
[utils] lazy init. ( #2148 )
...
* [utils] lazy init.
* [utils] remove description.
* [utils] complete.
* [utils] finalize.
* [utils] fix names.
2023-01-20 10:49:00 +08:00
oahzxl
72341e65f4
[auto-chunk] support extramsa ( #3 ) ( #2504 )
2023-01-20 10:13:03 +08:00
Ziyue Jiang
0f02b8c6e6
add avg partition ( #2483 )
...
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-19 13:54:50 +08:00
アマデウス
99d9713b02
Revert "Update parallel_context.py ( #2408 )"
...
This reverts commit 7d5640b9db.
2023-01-19 12:27:48 +08:00
oahzxl
ecccc91f21
[autochunk] support autochunk on evoformer ( #2497 )
2023-01-19 11:41:00 +08:00
oahzxl
5db3a5bf42
[fx] allow control of ckpt_codegen init ( #2498 )
...
* [fx] allow control of ckpt_codegen init
Currently in ColoGraphModule, ActivationCheckpointCodeGen is set automatically in __init__, which prevents any other codegen from being set.
So I added an arg to control whether ActivationCheckpointCodeGen is set in __init__.
* code style
2023-01-18 17:02:46 +08:00
HELSON
d565a24849
[zero] add unit testings for hybrid parallelism ( #2486 )
2023-01-18 10:36:10 +08:00
oahzxl
4953b4ace1
[autochunk] support evoformer tracer ( #2485 )
...
Support the full evoformer tracer; evoformer is a main module of alphafold. Previously we only supported a simplified version of it.
1. support some evoformer ops in fx
2. support evoformer test
3. add repos for test code
2023-01-16 19:25:05 +08:00
YuliangLiu0306
67e1912b59
[autoparallel] support origin activation ckpt on autoparallel system ( #2468 )
2023-01-16 16:25:13 +08:00
Ziyue Jiang
fef5c949c3
polish pp middleware ( #2476 )
...
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-13 16:56:01 +08:00