Commit Graph

2394 Commits (41fb7236aa32c307e83b0b9cc50ce2a6da279343)

Author SHA1 Message Date
Jiarui Fang e327e95144
[hotfix] gpt example titans bug #2493 (#2494) 2023-01-18 12:04:18 +08:00
jiaruifang e58cc441e2 polish code and fix dataloader bugs 2023-01-18 12:00:08 +08:00
jiaruifang a4b75b78a0 [hotfix] gpt example titans bug #2493 2023-01-18 11:37:16 +08:00
jiaruifang 8208fd023a Merge branch 'main' of https://github.com/hpcaitech/ColossalAI into dev0116 2023-01-18 11:32:29 +08:00
HELSON d565a24849
[zero] add unit testings for hybrid parallelism (#2486) 2023-01-18 10:36:10 +08:00
binmakeswell fcc6d61d92
[example] fix requirements (#2488) 2023-01-17 13:07:25 +08:00
oahzxl 4953b4ace1
[autochunk] support evoformer tracer (#2485)
Support the full evoformer tracer, which is a main module of AlphaFold. Previously we only supported a simplified version of it.
1. support some evoformer ops in fx
2. support evoformer test
3. add repos for test code
2023-01-16 19:25:05 +08:00
YuliangLiu0306 67e1912b59
[autoparallel] support origin activation ckpt on autoparallel system (#2468) 2023-01-16 16:25:13 +08:00
Jiarui Fang 3a21485ead
[example] titans for gpt (#2484) 2023-01-16 15:55:41 +08:00
jiaruifang 438ea608f3 update readme 2023-01-16 15:54:36 +08:00
jiaruifang 38424db6ff polish code 2023-01-16 15:21:22 +08:00
jiaruifang 92f65fbbe3 remove license 2023-01-16 15:18:49 +08:00
jiaruifang 315e1433ce polish readme 2023-01-16 15:17:27 +08:00
jiaruifang 37baea20cb [example] titans for gpt 2023-01-16 14:59:25 +08:00
jiaruifang 236b4195ff Merge branch 'main' of https://github.com/hpcaitech/ColossalAI into dev0116 2023-01-16 14:45:14 +08:00
jiaruifang e64a05b38b polish code 2023-01-16 14:45:06 +08:00
Jiarui Fang 7c31706227
[CI] add test_ci.sh for palm, opt and gpt (#2475) 2023-01-16 14:44:29 +08:00
Jiarui Fang e4c38ba367
[example] stable diffusion add roadmap (#2482) 2023-01-16 12:14:49 +08:00
jiaruifang 9cba38b492 add dummy test_ci.sh 2023-01-16 12:03:48 +08:00
jiaruifang f78bad21ed [example] stable diffusion add roadmap 2023-01-16 11:34:26 +08:00
Frank Lee 579dba572f
[workflow] fixed the skip condition of example weekly check workflow (#2481) 2023-01-16 10:05:41 +08:00
HELSON 21c88220ce
[zero] add unit test for low-level zero init (#2474) 2023-01-15 10:42:01 +08:00
ver217 f525d1f528
[example] update gpt gemini example ci test (#2477) 2023-01-13 22:37:31 +08:00
Ziyue Jiang fef5c949c3
polish pp middleware (#2476)
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2023-01-13 16:56:01 +08:00
HELSON a5dc4253c6
[zero] polish low level optimizer (#2473) 2023-01-13 14:56:17 +08:00
Frank Lee 8b7495dd54
[example] integrate seq-parallel tutorial with CI (#2463) 2023-01-13 14:40:05 +08:00
ver217 8e85d2440a
[example] update vit ci script (#2469)
* [example] update vit ci script

* [example] update requirements

* [example] update requirements
2023-01-13 13:31:27 +08:00
Jiarui Fang 867c8c2d3a
[zero] low level optim supports ProcessGroup (#2464) 2023-01-13 10:05:58 +08:00
Frank Lee e6943e2d11
[example] integrate autoparallel demo with CI (#2466)
* [example] integrate autoparallel demo with CI

* polish code

* polish code

* polish code

* polish code
2023-01-12 16:26:42 +08:00
Frank Lee 14d9299360
[cli] fixed hostname mismatch error (#2465) 2023-01-12 14:52:09 +08:00
YuliangLiu0306 c20529fe78
[examples] update autoparallel tutorial demo (#2449)
* [examples] update autoparallel tutorial demo

* add test_ci.sh

* polish

* add conda yaml
2023-01-12 14:30:58 +08:00
Haofan Wang 9358262992
Fix False warning in initialize.py (#2456)
* Update initialize.py

* pre-commit run check
2023-01-12 13:49:01 +08:00
Frank Lee 32c46e146e
[workflow] automated bdist wheel build (#2459)
* [workflow] automated bdist wheel build

* polish workflow

* polish readme

* polish readme
2023-01-12 10:57:02 +08:00
YuliangLiu0306 8221fd7485
[autoparallel] update binary elementwise handler (#2451)
* [autoparallel] update binary elementwise handler

* polish
2023-01-12 09:35:10 +08:00
Frank Lee c9ec5190a0
[workflow] automated the compatibility test (#2453)
* [workflow] automated the compatibility test

* polish code
2023-01-11 23:40:16 +08:00
Frank Lee 483efdabc5
[workflow] fixed the on-merge condition check (#2452) 2023-01-11 17:22:11 +08:00
Haofan Wang cfd1d5ee49
[example] fixed seed error in train_dreambooth_colossalai.py (#2445) 2023-01-11 16:56:15 +08:00
Frank Lee ac18a445fa
[example] updated large-batch optimizer tutorial (#2448)
* [example] updated large-batch optimizer tutorial

* polish code

* polish code
2023-01-11 16:27:31 +08:00
HELSON 2bfeb24308
[zero] add warning for ignored parameters (#2446) 2023-01-11 15:30:09 +08:00
Frank Lee 39163417a1
[example] updated the hybrid parallel tutorial (#2444)
* [example] updated the hybrid parallel tutorial

* polish code
2023-01-11 15:17:17 +08:00
HELSON 5521af7877
[zero] fix state_dict and load_state_dict for ddp ignored parameters (#2443)
* [ddp] add is_ddp_ignored

[ddp] rename to is_ddp_ignored

* [zero] fix state_dict and load_state_dict

* fix bugs

* [zero] update unit test for ZeroDDP
2023-01-11 14:55:41 +08:00
YuliangLiu0306 2731531bc2
[autoparallel] integrate device mesh initialization into autoparallelize (#2393)
* [autoparallel] integrate device mesh initialization into autoparallelize

* add megatron solution

* update gpt autoparallel examples with latest api

* adapt beta value to fit the current computation cost
2023-01-11 14:03:49 +08:00
Frank Lee c72c827e95
[cli] provided more details if colossalai run fail (#2442) 2023-01-11 13:56:42 +08:00
Super Daniel c41e59e5ad
[fx] allow native ckpt trace and codegen. (#2438) 2023-01-11 13:49:59 +08:00
YuliangLiu0306 41429b9b28
[autoparallel] add shard option (#2423) 2023-01-11 13:40:33 +08:00
Frank Lee 1b7587d958
[workflow] make test coverage report collapsable (#2436) 2023-01-11 13:37:48 +08:00
HELSON 7829aa094e
[ddp] add is_ddp_ignored (#2434)
[ddp] rename to is_ddp_ignored
2023-01-11 12:22:45 +08:00
Frank Lee a3e5496156
[example] improved the clarity of the example readme (#2427)
* [example] improved the clarity of the example readme

* polish workflow

* polish workflow

* polish workflow

* polish workflow

* polish workflow

* polish workflow
2023-01-11 10:46:32 +08:00
Frank Lee 21256674e9
[workflow] report test coverage even if below threshold (#2431) 2023-01-11 10:44:52 +08:00
HELSON bb4e9a311a
[zero] add inference mode and its unit test (#2418) 2023-01-11 10:07:37 +08:00