ColossalAI/colossalai/nn
Latest commit: a9b8300d54 by HELSON, 2022-04-11 13:38:51 +08:00
[zero] improve adaptability for not-shard parameters (#708)
* adapt post grad hooks for not-shard parameters
* adapt optimizer for not-shard parameters
* offload gradients for not-replicated parameters
layer/         [zero] improve adaptability for not-shard parameters (#708)                          2022-04-11 13:38:51 +08:00
loss/          [hotfix] Raise messages for indivisible batch sizes with tensor parallelism (#622)   2022-04-02 16:12:04 +08:00
lr_scheduler/  Refactored docstring to google style                                                  2022-03-29 17:17:47 +08:00
metric/        [hotfix] Raise messages for indivisible batch sizes with tensor parallelism (#622)   2022-04-02 16:12:04 +08:00
model/         Develop/experiments (#59)                                                             2021-12-09 15:08:29 +08:00
optimizer/     [zero] improve adaptability for not-shard parameters (#708)                          2022-04-11 13:38:51 +08:00
__init__.py    Layer integration (#83)                                                               2021-12-27 15:04:32 +08:00
init.py        Refactored docstring to google style                                                  2022-03-29 17:17:47 +08:00
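The subpackages listed above map directly onto import paths under colossalai.nn (for example colossalai.nn.optimizer and colossalai.nn.lr_scheduler). A minimal usage sketch follows, assuming a ColossalAI release from this period; HybridAdam and CosineAnnealingWarmupLR were part of the library's public API at the time, but exact constructor arguments may differ across versions.

```python
# Minimal sketch: using the optimizer/ and lr_scheduler/ subpackages of colossalai.nn.
# Assumes a ColossalAI version contemporary with this listing; verify the exact
# signatures of HybridAdam and CosineAnnealingWarmupLR against your installed release.
import torch

from colossalai.nn.optimizer import HybridAdam                   # from optimizer/
from colossalai.nn.lr_scheduler import CosineAnnealingWarmupLR   # from lr_scheduler/

# A toy model; any torch.nn.Module works here.
model = torch.nn.Linear(512, 512)

# HybridAdam keeps parameter states on CPU/GPU as appropriate; here only the
# basic (params, lr) usage is shown.
optimizer = HybridAdam(model.parameters(), lr=1e-3)

# Cosine decay over total_steps with a linear warmup phase.
lr_scheduler = CosineAnnealingWarmupLR(optimizer, total_steps=1000, warmup_steps=100)
```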