ColossalAI/colossalai
Baizhou Zhang 083d7da33d [pipeline] add pipeline support for all T5 models (#4310)
* complete policy for T5Model & T5ForConditionalGeneration

* modify function signature in forwards

* add forward for T5model

* add forward for T5ForConditionalGeneration

* fix a bug

* fix hidden_states transporting in decoder

* fix the passing of encoder_outputs
2023-08-15 23:25:14 +08:00
_C [setup] support pre-build and jit-build of cuda kernels (#2374) 2023-01-06 20:50:26 +08:00
_analyzer [example] add train resnet/vit with booster example (#3694) 2023-05-08 10:42:30 +08:00
amp [bf16] add bf16 support (#3882) 2023-06-05 15:58:31 +08:00
auto_parallel [NFC] polish runtime_preparation_pass style (#4266) 2023-07-26 14:12:57 +08:00
autochunk fix typo colossalai/auto_parallel autochunk fx/passes etc. (#3808) 2023-05-24 09:01:50 +08:00
booster [zero] support shard optimizer state dict of zero (#4194) 2023-07-31 22:13:29 +08:00
builder [NFC] polish colossalai/builder/__init__.py code style (#1560) 2022-09-08 22:11:04 +08:00
checkpoint_io [checkpointio] Sharded Optimizer Checkpoint for Gemini Plugin (#4302) 2023-07-21 14:39:01 +08:00
cli fix localhost measurement (#4320) 2023-08-01 10:14:00 +08:00
cluster [cluster] add process group mesh (#4039) 2023-08-15 23:25:14 +08:00
communication [NFC] fix: format (#4270) 2023-07-26 14:12:57 +08:00
context [CI] fix some spelling errors (#3707) 2023-05-10 17:12:03 +08:00
device [format] applied code formatting on changed files in pull request 4152 (#4157) 2023-07-04 16:07:47 +08:00
engine [nfc]fix ColossalaiOptimizer is not defined (#4122) 2023-06-30 17:23:22 +08:00
fx [nfc] fix typo colossalai/cli fx kernel (#3847) 2023-06-02 15:02:45 +08:00
interface [pipeline] refactor 1f1b schedule (#4115) 2023-08-15 23:25:14 +08:00
kernel [coloattention] fix import error (#4380) 2023-08-04 16:28:41 +08:00
lazy [shardformer] support lazy init (#4202) 2023-08-15 23:25:14 +08:00
logging [logger] hotfix, missing _FORMAT (#2231) 2022-12-29 22:59:39 +08:00
nn [doc] add Series A Funding and NeurIPS news (#4377) 2023-08-04 17:42:07 +08:00
pipeline [pipeline] test pure pipeline process using llama (#4218) 2023-08-15 23:25:14 +08:00
registry Remove duplication registry (#1078) 2022-06-08 07:47:24 +08:00
shardformer [pipeline] add pipeline support for all T5 models (#4310) 2023-08-15 23:25:14 +08:00
tensor [shardformer] support inplace sharding (#4251) 2023-08-15 23:25:14 +08:00
testing Next commit [checkpointio] Unsharded Optimizer Checkpoint for Gemini Plugin (#4141) 2023-07-07 16:33:06 +08:00
trainer fix typo with colossalai/trainer utils zero (#3908) 2023-06-07 16:08:37 +08:00
utils [test] remove useless tests (#4359) 2023-08-01 18:52:14 +08:00
zero [hotfix] fix unsafe async comm in zero (#4404) 2023-08-11 15:09:24 +08:00
__init__.py [setup] supported conda-installed torch (#2048) 2022-11-30 16:45:15 +08:00
constants.py updated tp layers 2022-11-02 12:19:38 +08:00
core.py [Tensor] distributed view supports inter-process hybrid parallel (#1169) 2022-06-27 09:45:26 +08:00
global_variables.py [NFC] polish colossalai/global_variables.py code style (#3259) 2023-03-29 15:22:21 +08:00
initialize.py [nfc] fix typo colossalai/zero (#3923) 2023-06-08 00:01:29 +08:00