ColossalAI/colossalai/utils
HELSON e6d50ec107
[zero] adapt zero for unsharded parameters (#561)
* support existing sharded and unsharded parameters in zero

* add unit test for moe-zero model init

* polish moe gradient handler
2022-03-31 18:34:11 +08:00
data_sampler Refactored docstring to google style 2022-03-29 17:17:47 +08:00
gradient_accumulation Refactored docstring to google style 2022-03-29 17:17:47 +08:00
memory_tracer [zero] adapt zero for unsharded parameters (#561) 2022-03-31 18:34:11 +08:00
memory_utils [zero] trace states of fp16/32 grad and fp32 param (#571) 2022-03-31 16:26:54 +08:00
multi_tensor_apply Refactored docstring to google style 2022-03-29 17:17:47 +08:00
profiler html refactor (#555) 2022-03-31 11:36:56 +08:00
tensor_detector Refactored docstring to google style 2022-03-29 17:17:47 +08:00
__init__.py [memory] add model data tensor moving api (#503) 2022-03-24 14:29:41 +08:00
activation_checkpoint.py Refactored docstring to google style 2022-03-29 17:17:47 +08:00
checkpointing.py Refactored docstring to google style 2022-03-29 17:17:47 +08:00
common.py Refactored docstring to google style 2022-03-29 17:17:47 +08:00
cuda.py Fixed docstring in colossalai (#171) 2022-01-21 10:44:30 +08:00
moe.py Refactored docstring to google style 2022-03-29 17:17:47 +08:00
timer.py Refactored docstring to google style 2022-03-29 17:17:47 +08:00