CsRic
f3403ff98e
[embeddings] add already_split_along_rank flag for tablewise mode ( #1584 )
2022-09-13 10:50:34 +08:00
github-actions[bot]
77399dc91b
Automated submodule synchronization ( #1550 )
...
Co-authored-by: github-actions <github-actions@github.com>
2022-09-13 10:03:33 +08:00
Boyuan Yao
f3687e4ee2
[fx] Add nested checkpoint in activation checkpoint codegen ( #1585 )
...
* [fx] add nested activation_checkpoint codegen
* undo algorithms commits
* solver
* undo some commits
* [fx] torch11 add nested activation checkpoint codegen
* remove some imports
* [fx] add some comments in activation codegen
* [fx] codegen instance error fix
2022-09-12 20:00:48 +08:00
binmakeswell
1c9ec32734
[NFC] add OPT serving ( #1581 )
2022-09-09 16:56:45 +08:00
Boyuan Yao
20e466527b
[NFC] polish ./colossalai/trainer/hooks/_lr_scheduler_hook.py code style ( #1576 )
2022-09-08 22:11:04 +08:00
Fazzie-Maqianli
06dccdde44
[NFC] polish colossalai/zero/sharded_model/reduce_scatter.py code style ( #1554 )
2022-09-08 22:11:04 +08:00
CsRic
2ac46f7be4
[NFC] polish utils/tensor_detector/__init__.py code style ( #1573 )
...
Co-authored-by: ric <mkkt_bkkt@mail.ustc.edu.cn>
2022-09-08 22:11:04 +08:00
Sze-qq
2144cbae8c
[NFC] polish colossalai/nn/lr_scheduler/multistep.py code style ( #1572 )
2022-09-08 22:11:04 +08:00
superhao1995
e4bf7ae667
[NFC] polish colossalai/nn/lr_scheduler/torch.py code style ( #1571 )
...
Co-authored-by: Research <research@soccf-snr3-017.comp.nus.edu.sg>
2022-09-08 22:11:04 +08:00
Jiatong Han
3263cdf57f
[NFC] polish colossalai/nn/parallel/data_parallel.py code style ( #1570 )
...
Co-authored-by: JThh <jiatong.han@u.nus.edu>
2022-09-08 22:11:04 +08:00
Zirui Zhu
f566c9b98d
[NFC] polish colossalai/pipeline/utils.py code style ( #1562 )
2022-09-08 22:11:04 +08:00
Xue Fuzhao
e070ca45c6
[NFC] polish colossalai/fx/tracer/meta_patch/patched_module/convolution.py code style ( #1563 )
2022-09-08 22:11:04 +08:00
Zangwei Zheng
9823cbf24b
[NFC] polish colossalai/gemini/update/chunkv2.py code style ( #1565 )
2022-09-08 22:11:04 +08:00
DouJS
f586887a90
[NFC] polish colossalai/nn/layer/colossalai_layer/dropout.py code style ( #1568 )
2022-09-08 22:11:04 +08:00
LuGY
c7d4932956
[NFC] polish colossalai/utils/tensor_detector/tensor_detector.py code style ( #1566 )
2022-09-08 22:11:04 +08:00
BigOneLiXiaoMing
0c4c9aa6e0
[NFC] polish colossalai/nn/_ops/embedding.py code style ( #1561 )
2022-09-08 22:11:04 +08:00
Ziheng Qin
08815f0e72
[NFC] polish colossalai/builder/__init__.py code style ( #1560 )
...
Co-authored-by: henryqin1997 <henryqin1997@gamil.com>
2022-09-08 22:11:04 +08:00
Super Daniel
8328917348
[NFC] polish colossalai/testing/comparison.py code style. ( #1558 )
2022-09-08 22:11:04 +08:00
Ofey Chan
7cc052f6c0
[NFC] polish colossalai/nn/layer/colossalai_layer/linear.py ( #1556 )
2022-09-08 22:11:04 +08:00
Kai Wang (Victor Kai)
46931e3c32
[NFC] polish code colossalai/gemini/update/search_utils.py ( #1557 )
2022-09-08 22:11:04 +08:00
yuxuan-lou
413f9c19f4
[NFC] polish colossalai/nn/_ops/layernorm.py code style ( #1555 )
2022-09-08 22:11:04 +08:00
shenggan
8edb777cc2
[NFC] polish colossalai/nn/loss/loss_2p5d.py code style ( #1553 )
2022-09-08 22:11:04 +08:00
Maruyama_Aya
bd2d789832
[NFC] polish colossalai/nn/_ops/embedding_bag.py code style ( #1552 )
2022-09-08 22:11:04 +08:00
binmakeswell
73e9eb13b7
[NFC] polish colossalai/nn/lr_scheduler/cosine.py code style
2022-09-08 22:11:04 +08:00
Kirigaya Kazuto
318fbf1145
[NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style ( #1559 )
2022-09-08 22:04:34 +08:00
ver217
b0f4c0bddf
update version ( #1574 )
2022-09-08 16:47:07 +08:00
CsRic
a389ac4ec9
[embedding] cache_embedding small improvement ( #1564 )
2022-09-08 16:41:19 +08:00
ver217
10dd8226b1
add gather_output for VocabParallelClassifier1D ( #1569 )
2022-09-08 16:40:56 +08:00
アマデウス
e615cfc3a8
[NFC] polish test component gpt code style ( #1567 )
2022-09-08 16:34:09 +08:00
Kirigaya Kazuto
6159d45417
[pipeline/tuning] improve dispatch performance in both time and space cost ( #1544 )
2022-09-07 19:01:06 +08:00
Super Daniel
4f59693207
[fx] provide a stable but not accurate enough version of profiler. ( #1547 )
...
* [fx] compute memory stat and flop count for MetaInfoProp.
* [fx] modify node attribute.
* [fx] modify ckpt_chen.
* [fx] fix compatibility.
* [fx] fix import error.
* [fx] skip test for MetaInfoProp.
* [fx] skip if torch 1.11.0.
* [fx] recover MetaInfoProp support for PyTorch 1.11.
* [fx] provide a stable but not accurate enough version of profiler.
* [fx] fix compatibility in tests.
* [fx] fix import error.
2022-09-07 11:21:04 +08:00
github-actions[bot]
7d49e7b2db
Automated submodule synchronization ( #1534 )
...
Co-authored-by: github-actions <github-actions@github.com>
2022-09-07 11:20:01 +08:00
YuliangLiu0306
0908d0fc61
[autoparallel] add backward cost info into strategies ( #1524 )
2022-09-07 11:19:00 +08:00
YuliangLiu0306
1a3599410d
[autoparallel] support function in operator handler ( #1529 )
2022-09-07 11:18:41 +08:00
YuliangLiu0306
44c866a3e3
[autoparallel] change the merge node logic ( #1533 )
2022-09-07 11:18:19 +08:00
ver217
ae71036cd2
[utils] refactor parallel layers checkpoint and bcast model on loading checkpoint ( #1548 )
...
* refactor parallel layer
* broadcast rank0 model after load ckpt
2022-09-06 20:18:35 +08:00
ver217
2bed096848
[utils] optimize partition_tensor_parallel_state_dict ( #1546 )
2022-09-06 17:45:31 +08:00
Super Daniel
d8a5aded19
[hotfix] change namespace for meta_trace. ( #1541 )
2022-09-06 11:46:12 +08:00
ver217
a203b709d5
[hotfix] fix init context ( #1543 )
...
* fix init context
* fix lazy init ctx
2022-09-06 11:45:08 +08:00
Jiarui Fang
64169f3e8f
[embedding] polish parallel embedding tablewise ( #1545 )
2022-09-06 10:41:20 +08:00
Boyuan Yao
46c6cc79a9
[fx] Add common node in model linearize ( #1542 )
...
* [fx] Add common node into linearize
* [fx] Add common node to solver
2022-09-05 18:35:05 +08:00
CsRic
964123ae0f
[embedding] freq_aware_embedding: add small functions for caller application ( #1537 )
2022-09-05 15:12:53 +08:00
Super Daniel
70129603aa
[fx] support meta tracing for aten level computation graphs like functorch. ( #1536 )
...
* [fx] support meta tracing for aten level computation graphs like functorch.
* [fx] remove redundant import.
* [fx] add docstring.
2022-09-05 12:10:09 +08:00
Jiarui Fang
521078ffc9
[embedding] fix a bug in table wise sharding ( #1538 )
2022-09-02 15:48:35 +08:00
Jiarui Fang
87134524fd
[embedding] tablewise sharding polish ( #1535 )
2022-09-02 11:09:37 +08:00
Boyuan Yao
56159049e8
[fx] Modify solver linearize and add corresponding test ( #1531 )
...
* [fx] modify solver linearize and add test
* [fx] add torch11 test of linearize but skip it
* [fx] remove some unused imports
2022-09-02 10:24:41 +08:00
Super Daniel
7dc53237c3
[fx] add test for meta tensor. ( #1527 )
...
* [fx] add test for meta tensor.
* [fx] fix error.
2022-09-01 19:30:05 +08:00
YuliangLiu0306
4b3d6caeb3
[fx] patch nn.functional convolution ( #1528 )
2022-09-01 19:05:07 +08:00
CsRic
5156d5b4f8
[embedding] add tablewise sharding for FAW ( #1526 )
2022-09-01 17:55:41 +08:00
Kirigaya Kazuto
f1e1836218
[pipeline/pipeline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP ( #1508 )
...
* support p2p communication with any type of object | pass test
* reconstruct pipeline schedule with p2p_v2.py (support communication with List[Any]) | pass test
* [engine/schedule] use p2p_v2 to reconstruct pipeline_schedule
* [pipeline/rpc] implement a demo for PP with cuda rpc framework
* [pipeline/rpc] support interleaving | fix checkpoint bug | change logic when dispatching data in work_list to ensure steady 1F1B
* [pipeline/rpc] implement distributed optimizer | test with assert_close
* [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy
* [pipeline/pipeline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP
* [pipeline/pipeline_process_group] remove comment
* [pipeline/pipeline_process_group] skip process group test
* [pipeline/pipeline_process_group] remove test named function
2022-09-01 17:45:47 +08:00