InternLM/internlm/solver/optimizer
jiaopenglong 112c34ae09
feat(grad_norm): vocab grad norm profiling (#519)
* compute vocab grad norm && save pt

* add grad_norm profiling interval && refactor save grad norm

* fix ci test_pipeline
2023-12-06 13:52:42 +08:00
__init__.py feat(train): add fsdp training option (#293) 2023-10-09 18:59:31 +08:00
base_optimizer.py feat(train): add fsdp training option (#293) 2023-10-09 18:59:31 +08:00
fsdp_optimizer.py fix(optimizer/fsdp_optimizer.py): fsdp process empty params group (#408) 2023-10-10 20:06:04 +08:00
hybrid_zero_optim.py feat(grad_norm): vocab grad norm profiling (#519) 2023-12-06 13:52:42 +08:00
store.py feat(moe):support zero for expert local dp (#404) 2023-10-09 17:45:26 +08:00
utils.py feat(grad_norm): vocab grad norm profiling (#519) 2023-12-06 13:52:42 +08:00