jiaruifang
8208fd023a
Merge branch 'main' of https://github.com/hpcaitech/ColossalAI into dev0116
2023-01-18 11:32:29 +08:00
HELSON
d565a24849
[zero] add unit tests for hybrid parallelism ( #2486 )
2023-01-18 10:36:10 +08:00
binmakeswell
fcc6d61d92
[example] fix requirements ( #2488 )
2023-01-17 13:07:25 +08:00
oahzxl
4953b4ace1
[autochunk] support evoformer tracer ( #2485 )
...
Support the full Evoformer tracer, which is a main module of AlphaFold. Previously we only supported a simplified version of it.
1. support some Evoformer ops in fx
2. support evoformer test
3. add repos for test code
2023-01-16 19:25:05 +08:00
YuliangLiu0306
67e1912b59
[autoparallel] support original activation ckpt on autoparallel system ( #2468 )
2023-01-16 16:25:13 +08:00
Jiarui Fang
3a21485ead
[example] titans for gpt ( #2484 )
2023-01-16 15:55:41 +08:00
jiaruifang
438ea608f3
update readme
2023-01-16 15:54:36 +08:00
jiaruifang
38424db6ff
polish code
2023-01-16 15:21:22 +08:00
jiaruifang
92f65fbbe3
remove license
2023-01-16 15:18:49 +08:00
jiaruifang
315e1433ce
polish readme
2023-01-16 15:17:27 +08:00
jiaruifang
37baea20cb
[example] titans for gpt
2023-01-16 14:59:25 +08:00
jiaruifang
236b4195ff
Merge branch 'main' of https://github.com/hpcaitech/ColossalAI into dev0116
2023-01-16 14:45:14 +08:00
jiaruifang
e64a05b38b
polish code
2023-01-16 14:45:06 +08:00
Jiarui Fang
7c31706227
[CI] add test_ci.sh for palm, opt and gpt ( #2475 )
2023-01-16 14:44:29 +08:00
Jiarui Fang
e4c38ba367
[example] stable diffusion add roadmap ( #2482 )
2023-01-16 12:14:49 +08:00
jiaruifang
9cba38b492
add dummy test_ci.sh
2023-01-16 12:03:48 +08:00
jiaruifang
f78bad21ed
[example] stable diffusion add roadmap
2023-01-16 11:34:26 +08:00
Frank Lee
579dba572f
[workflow] fixed the skip condition of example weekly check workflow ( #2481 )
2023-01-16 10:05:41 +08:00
HELSON
21c88220ce
[zero] add unit test for low-level zero init ( #2474 )
2023-01-15 10:42:01 +08:00
ver217
f525d1f528
[example] update gpt gemini example ci test ( #2477 )
2023-01-13 22:37:31 +08:00
Ziyue Jiang
fef5c949c3
polish pp middleware ( #2476 )
...
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-13 16:56:01 +08:00
HELSON
a5dc4253c6
[zero] polish low level optimizer ( #2473 )
2023-01-13 14:56:17 +08:00
Frank Lee
8b7495dd54
[example] integrate seq-parallel tutorial with CI ( #2463 )
2023-01-13 14:40:05 +08:00
ver217
8e85d2440a
[example] update vit ci script ( #2469 )
...
* [example] update vit ci script
* [example] update requirements
* [example] update requirements
2023-01-13 13:31:27 +08:00
Jiarui Fang
867c8c2d3a
[zero] low level optim supports ProcessGroup ( #2464 )
2023-01-13 10:05:58 +08:00
Frank Lee
e6943e2d11
[example] integrate autoparallel demo with CI ( #2466 )
...
* [example] integrate autoparallel demo with CI
* polish code
* polish code
* polish code
* polish code
2023-01-12 16:26:42 +08:00
Frank Lee
14d9299360
[cli] fixed hostname mismatch error ( #2465 )
2023-01-12 14:52:09 +08:00
YuliangLiu0306
c20529fe78
[examples] update autoparallel tutorial demo ( #2449 )
...
* [examples] update autoparallel tutorial demo
* add test_ci.sh
* polish
* add conda yaml
2023-01-12 14:30:58 +08:00
Haofan Wang
9358262992
Fix false warning in initialize.py ( #2456 )
...
* Update initialize.py
* pre-commit run check
2023-01-12 13:49:01 +08:00
Frank Lee
32c46e146e
[workflow] automated bdist wheel build ( #2459 )
...
* [workflow] automated bdist wheel build
* polish workflow
* polish readme
* polish readme
2023-01-12 10:57:02 +08:00
YuliangLiu0306
8221fd7485
[autoparallel] update binary elementwise handler ( #2451 )
...
* [autoparallel] update binary elementwise handler
* polish
2023-01-12 09:35:10 +08:00
Frank Lee
c9ec5190a0
[workflow] automated the compatibility test ( #2453 )
...
* [workflow] automated the compatibility test
* polish code
2023-01-11 23:40:16 +08:00
Frank Lee
483efdabc5
[workflow] fixed the on-merge condition check ( #2452 )
2023-01-11 17:22:11 +08:00
Haofan Wang
cfd1d5ee49
[example] fixed seed error in train_dreambooth_colossalai.py ( #2445 )
2023-01-11 16:56:15 +08:00
Frank Lee
ac18a445fa
[example] updated large-batch optimizer tutorial ( #2448 )
...
* [example] updated large-batch optimizer tutorial
* polish code
* polish code
2023-01-11 16:27:31 +08:00
HELSON
2bfeb24308
[zero] add warning for ignored parameters ( #2446 )
2023-01-11 15:30:09 +08:00
Frank Lee
39163417a1
[example] updated the hybrid parallel tutorial ( #2444 )
...
* [example] updated the hybrid parallel tutorial
* polish code
2023-01-11 15:17:17 +08:00
HELSON
5521af7877
[zero] fix state_dict and load_state_dict for ddp ignored parameters ( #2443 )
...
* [ddp] add is_ddp_ignored
[ddp] rename to is_ddp_ignored
* [zero] fix state_dict and load_state_dict
* fix bugs
* [zero] update unit test for ZeroDDP
2023-01-11 14:55:41 +08:00
YuliangLiu0306
2731531bc2
[autoparallel] integrate device mesh initialization into autoparallelize ( #2393 )
...
* [autoparallel] integrate device mesh initialization into autoparallelize
* add megatron solution
* update gpt autoparallel examples with latest api
* adapt beta value to fit the current computation cost
2023-01-11 14:03:49 +08:00
Frank Lee
c72c827e95
[cli] provided more details if colossalai run fails ( #2442 )
2023-01-11 13:56:42 +08:00
Super Daniel
c41e59e5ad
[fx] allow native ckpt trace and codegen. ( #2438 )
2023-01-11 13:49:59 +08:00
YuliangLiu0306
41429b9b28
[autoparallel] add shard option ( #2423 )
2023-01-11 13:40:33 +08:00
Frank Lee
1b7587d958
[workflow] make test coverage report collapsable ( #2436 )
2023-01-11 13:37:48 +08:00
HELSON
7829aa094e
[ddp] add is_ddp_ignored ( #2434 )
...
[ddp] rename to is_ddp_ignored
2023-01-11 12:22:45 +08:00
Frank Lee
a3e5496156
[example] improved the clarity of the example readme ( #2427 )
...
* [example] improved the clarity of the example readme
* polish workflow
* polish workflow
* polish workflow
* polish workflow
* polish workflow
* polish workflow
2023-01-11 10:46:32 +08:00
Frank Lee
21256674e9
[workflow] report test coverage even if below threshold ( #2431 )
2023-01-11 10:44:52 +08:00
HELSON
bb4e9a311a
[zero] add inference mode and its unit test ( #2418 )
2023-01-11 10:07:37 +08:00
Frank Lee
63be79d505
[example] removed duplicated stable diffusion example ( #2424 )
2023-01-11 10:07:18 +08:00
Frank Lee
cd38167c1a
[doc] added documentation for CI/CD ( #2420 )
...
* [doc] added documentation for CI/CD
* polish markdown
* polish markdown
* polish markdown
2023-01-10 22:30:32 +08:00
Frank Lee
b3472d32e0
[workflow] auto comment with test coverage report ( #2419 )
...
* [workflow] auto comment with test coverage report
* polish code
* polish yaml
2023-01-10 22:30:16 +08:00