InternLM/internlm/utils
Guoteng f6e007f95b
feat(ckpt): fix checkpoint bugs and add feature enhancements. (#259)
* fix(ckpt): ckpt bug fix and api refactor
1. fix latest ckpt query bug
2. add ckpt unit test
3. fix storage manager boto3/local client get_fns bug
4. fix only model load case zero fp32 buffer overwrite model weights bug.
5. add ckpt_type and add zero reload ci-test

* fix(ckpt): fix ckpt and trainer bug

* fix and refactor

* fix base on comment

* feat: add legacy api
2023-09-05 17:40:48 +08:00
__init__.py initial commit 2023-07-06 12:55:23 +08:00
checkpoint.py initial commit 2023-07-06 12:55:23 +08:00
common.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
evaluation.py fix(eval): StreamingDataset does not have an __len__ method. (#251) 2023-08-31 15:29:04 +08:00
gputest.py Feat/add runntime gpu test (#254) 2023-09-01 13:38:01 +08:00
logger.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
megatron_timers.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
model_checkpoint.py feat(ckpt): fix checkpoint bugs and add feature enhancements. (#259) 2023-09-05 17:40:48 +08:00
parallel.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
registry.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
simple_memory_profiler.py Merge develop to main (#233) 2023-08-24 22:03:04 +08:00
storage_manager.py feat(ckpt): fix checkpoint bugs and add feature enhancements. (#259) 2023-09-05 17:40:48 +08:00
timeout.py [feat]: add pal reasoning script (#163) 2023-08-10 17:53:46 +08:00
writer.py feat(utils/writer.py): support writer add_scalars for writing dict data (#257) 2023-09-01 13:24:46 +08:00