Jiang Zhuo
|
5a4a3b77d9
|
fix format (#376)
|
2022-03-11 15:50:28 +08:00 |
lucasliunju
|
ce886a9062
|
fix format (#374)
|
2022-03-11 15:50:28 +08:00 |
Frank Lee
|
526a318032
|
[unit test] Refactored test cases with component func (#339)
* refactored test with component func
* fixed bug
|
2022-03-11 15:50:28 +08:00 |
LuGY
|
de46450461
|
Added activation offload (#331)
* Added activation offload
* Fixed the import bug, used the pytest
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
272ebfb57d
|
[bug] shard param during initializing the ShardedModelV2 (#381)
|
2022-03-11 15:50:28 +08:00 |
HELSON
|
8c18eb0998
|
[profiler] Fixed bugs in CommProfiler and PcieProfiler (#377)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
b5f43acee3
|
[zero] find miss code (#378)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
6b6002962a
|
[zero] zero init context collect numel of model (#375)
|
2022-03-11 15:50:28 +08:00 |
HELSON
|
1ed7c24c02
|
Added PCIE profiler to detect data transmission (#373)
|
2022-03-11 15:50:28 +08:00 |
jiaruifang
|
d9217e1960
|
Revert "[zero] bucketized tensor cpu gpu copy (#368)"
This reverts commit bef05489b6.
|
2022-03-11 15:50:28 +08:00 |
Xue Fuzhao
|
a8cd5e8e81
|
Update README-zh-Hans.md (#367)
Fuzhao updated
|
2022-03-11 15:50:28 +08:00 |
Shen Chenhui
|
1c88dd43e2
|
Fix/format (#366)
|
2022-03-11 15:50:28 +08:00 |
Ziheng Qin
|
0db43fa995
|
fix format (#364)
|
2022-03-11 15:50:28 +08:00 |
RichardoLuo
|
8539898ec6
|
flake8 style change (#363)
|
2022-03-11 15:50:28 +08:00 |
Kai Wang (Victor Kai)
|
53bb3bcc0a
|
fix format (#362)
|
2022-03-11 15:50:28 +08:00 |
ziyu huang
|
a77d73f22b
|
fix format parallel_context.py (#359)
Co-authored-by: huangziyu <202476410arsmart@gmail.com>
|
2022-03-11 15:50:28 +08:00 |
Zangwei
|
c695369af0
|
fix format constants.py (#358)
|
2022-03-11 15:50:28 +08:00 |
Yuer867
|
4a0f8c2c50
|
fix format parallel_2p5d (#357)
|
2022-03-11 15:50:28 +08:00 |
Liang Bowen
|
7eb87f516d
|
flake8 style (#352)
|
2022-03-11 15:50:28 +08:00 |
Xu Kai
|
54ee8d1254
|
Fix/format colossalai/engine/paramhooks/ (#350)
|
2022-03-11 15:50:28 +08:00 |
Maruyama_Aya
|
e83970e3dc
|
fix format ColossalAI\colossalai\context\process_group_initializer
|
2022-03-11 15:50:28 +08:00 |
yuxuan-lou
|
3b88eb2259
|
Flake8 code restyle
|
2022-03-11 15:50:28 +08:00 |
xyupeng
|
af801cb4df
|
fix format setup.py (#343)
|
2022-03-11 15:50:28 +08:00 |
xuqifan897
|
148207048e
|
Qifan formatted file ColossalAI\colossalai\nn\layer\parallel_1d\layers.py (#342)
|
2022-03-11 15:50:28 +08:00 |
Cautiousss
|
3a51d909af
|
fix format (#332)
Co-authored-by: 何晓昕 <cautious@r-205-106-25-172.comp.nus.edu.sg>
|
2022-03-11 15:50:28 +08:00 |
DouJS
|
cbb6436ff0
|
fix format for dir-[parallel_3d] (#333)
|
2022-03-11 15:50:28 +08:00 |
ExtremeViscent
|
eaac03ae1d
|
[format] format fixed for kernel\cuda_native codes (#335)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
00670c870e
|
[zero] bucketized tensor cpu gpu copy (#368)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
44e4891f57
|
[zero] able to place params on cpu after zero init context (#365)
* place params on cpu after zero init context
* polish code
|
2022-03-11 15:50:28 +08:00 |
ver217
|
b66f3b994c
|
increase the timeout limit in CI temporarily
|
2022-03-11 15:50:28 +08:00 |
ver217
|
52d055119b
|
increase the timeout limit in CI temporarily
|
2022-03-11 15:50:28 +08:00 |
ver217
|
253e54d98a
|
fix grad shape
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
ea2872073f
|
[zero] global model data memory tracer (#360)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
cb34cd384d
|
[test] polish zero-related unit tests (#351)
|
2022-03-11 15:50:28 +08:00 |
HELSON
|
534e0bb118
|
Fixed import bug for no-tensorboard environment (#354)
|
2022-03-11 15:50:28 +08:00 |
HELSON
|
c57e089824
|
[profile] added example for ProfilerContext (#349)
|
2022-03-11 15:50:28 +08:00 |
ver217
|
532ae79cb0
|
add test sharded optim with cpu adam (#347)
|
2022-03-11 15:50:28 +08:00 |
Jiarui Fang
|
10e2826426
|
move async memory to an individual directory (#345)
|
2022-03-11 15:50:28 +08:00 |
HELSON
|
425bb0df3f
|
Added Profiler Context to manage all profilers (#340)
|
2022-03-11 15:50:28 +08:00 |
ver217
|
d0ae0f2215
|
[zero] update sharded optim v2 (#334)
|
2022-03-11 15:50:28 +08:00 |
ver217
|
2b8cddd40e
|
skip bert in test engine
|
2022-03-11 15:50:28 +08:00 |
ver217
|
d41a9f12c6
|
install transformers in CI
|
2022-03-11 15:50:28 +08:00 |
ver217
|
f5f0ad266e
|
fix bert unit test
|
2022-03-11 15:50:28 +08:00 |
jiaruifang
|
5663616921
|
polish code
|
2022-03-11 15:50:28 +08:00 |
jiaruifang
|
d271f2596b
|
polish engine unit test
|
2022-03-11 15:50:28 +08:00 |
jiaruifang
|
354c0f9047
|
polish code
|
2022-03-11 15:50:28 +08:00 |
jiaruifang
|
4d94cd513e
|
adapting bert unit test interface
|
2022-03-11 15:50:28 +08:00 |
jiaruifang
|
7977422aeb
|
add bert for unit test; sharded model is not able to pass the bert case
|
2022-03-11 15:50:28 +08:00 |
Frank Lee
|
3d5d64bd10
|
refactored grad scaler (#338)
|
2022-03-11 15:50:28 +08:00 |
Frank Lee
|
6a3188167c
|
set criterion as optional in colossalai initialize (#336)
|
2022-03-11 15:50:28 +08:00 |