ColossalAI/colossalai/zero
ver217 fce9432f08 sync before creating empty grad 2022-03-16 14:24:09 +08:00
init_ctx       use double buffer to handle grad                     2022-03-16 14:24:09 +08:00
shard_utils    polish code                                          2022-03-14 15:48:55 +08:00
sharded_model  sync before creating empty grad                      2022-03-16 14:24:09 +08:00
sharded_optim  [zero] refactory ShardedOptimV2 init method (#416)   2022-03-15 10:45:55 +08:00
sharded_param  use double buffer to handle grad                     2022-03-16 14:24:09 +08:00
__init__.py    added buffer sync to naive amp model wrapper (#291)  2022-03-11 15:50:28 +08:00