InternLM

zaglc a075153adf	feat(train): add fsdp training option (#293)	2023-10-09 18:59:31 +08:00

* feat(fsdp): add training option for fsdp
* fix(fsdp): add mix-precision training
* fix failure in lint-check
* fix format problem
* restore 7B_sft
* fix load ckpt bug
* fix load ckpt bug2
* feat(solver/optimizer): add new file fsdp_optimizer.py
* fix(train.py): fix ci lint error
* fix(fsdp_optimizer.py): wait grad async
* fix bug for loading ckpts when zero1 < dp_size
* fix(context/parallel_context.py): only log warning for fsdp
* change ckpt name
* fix(model/modeling_internlm.py): fix checkpoint=False runtime error
* more wrap
* add support for FSDP with tp
* modify args_sanity_check for fsdp with pipeline and fsdp with moe
* fix(internlm/utils/parallel.py): fix circular import
* fix(internlm/train/training_internlm.py): remove set IS_TENSOR_PARALLEL attr
* fix(internlm/train/training_internlm.py): update wrap class and fix lint error
* fix(internlm/model): reset dropout_selective_checkpoint=True
* feat(configs/7B_sft.py): move fsdp config to parallel zero1
* feat(configs/7B_sft.py): adapt to old version config

Co-authored-by: huangting4201 <1538303371@qq.com>
legacy	feat(ckpt): fix checkpoint bugs and add feature enhancements. (#259)	2023-09-05 17:40:48 +08:00
__init__.py	feat(numa): bind numa if possible (#320)	2023-09-25 19:34:52 +08:00
initialize_tensor.py	feat(model): implement uniform_init for tensor. (#252)	2023-09-01 01:12:53 +08:00
initialize_trainer.py	docs(*): add documentation and reST files for readthedocs (#272)	2023-09-06 15:36:03 +08:00
launch.py	feat(train): add fsdp training option (#293)	2023-10-09 18:59:31 +08:00