Booster Checkpoint

Prerequisite:

Booster API

Introduction

The previous tutorial introduced the Booster API. This tutorial shows how to save and load checkpoints with the booster.

Model Checkpoint

The model must be boosted by colossalai.booster.Booster before it is saved. The checkpoint argument is the path where the checkpoint is written. If shard=False, it can be a single file; if shard=True, it should be a directory, and the checkpoint is saved as multiple shards. Sharding is useful when the checkpoint is too large to be saved in a single file. Our sharded checkpoint format is compatible with huggingface/transformers.
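As a minimal sketch of the saving step, the helper below picks a file path or a directory depending on shard and hands it to the booster. It assumes the method is called save_model and takes the model, the checkpoint path, and a shard keyword, as the description suggests; the helper name and the names model.pt and model_sharded are hypothetical. The booster and the boosted model are passed in by the caller, so the sketch does not set up a distributed environment itself.

```python
from pathlib import Path


def save_boosted_model(booster, model, out_dir, shard=False):
    """Save an already-boosted model and return the checkpoint path used.

    `booster` is expected to be a colossalai.booster.Booster and `model`
    the model returned by boosting. Both are supplied by the caller.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    if shard:
        # Sharded save: the checkpoint path is a directory holding the shards.
        checkpoint = out_dir / "model_sharded"  # hypothetical directory name
        checkpoint.mkdir(exist_ok=True)
    else:
        # Unsharded save: the checkpoint path is a single file.
        checkpoint = out_dir / "model.pt"  # hypothetical file name
    # Assumed method name and keyword, following the description in the text.
    booster.save_model(model, str(checkpoint), shard=shard)
    return checkpoint
```

Choosing the path type from shard keeps the caller from passing a file path to a sharded save, which is the mismatch the paragraph warns about.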

The model must be boosted by colossalai.booster.Booster before a checkpoint is loaded into it. The checkpoint format is detected automatically, and the checkpoint is loaded accordingly.
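Loading can be sketched the same way. The helper below assumes the booster exposes a load_model method that takes the boosted model and the checkpoint path; the helper name is hypothetical. Because the format is detected automatically, the same call is used for a single file and for a sharded directory.

```python
from pathlib import Path


def load_boosted_model(booster, model, checkpoint):
    """Load a checkpoint (file or sharded directory) into a boosted model."""
    checkpoint = Path(checkpoint)
    if not checkpoint.exists():
        # Fail early with a clear message instead of deep inside the loader.
        raise FileNotFoundError(f"checkpoint not found: {checkpoint}")
    # No format flag is passed: per the text, the format is detected automatically.
    booster.load_model(model, str(checkpoint))  # assumed method name
    return model
```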

Optimizer Checkpoint

⚠ Saving optimizer checkpoints in a sharded way is not supported yet.

The optimizer must be boosted by colossalai.booster.Booster before its checkpoint is saved or loaded.
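A minimal sketch of the optimizer round trip, assuming the booster methods are named save_optimizer and load_optimizer and take the boosted optimizer followed by the checkpoint path. The helper names are hypothetical. Since sharded optimizer checkpoints are not supported yet, the path is always a single file and no shard argument is passed.

```python
from pathlib import Path


def checkpoint_optimizer(booster, optimizer, checkpoint_file):
    """Save the boosted optimizer's state to one file and return its path."""
    checkpoint_file = Path(checkpoint_file)
    checkpoint_file.parent.mkdir(parents=True, exist_ok=True)
    # Unsharded only: sharded optimizer saving is not supported yet.
    booster.save_optimizer(optimizer, str(checkpoint_file))  # assumed method name
    return checkpoint_file


def restore_optimizer(booster, optimizer, checkpoint_file):
    """Load optimizer state from `checkpoint_file` into the boosted optimizer."""
    booster.load_optimizer(optimizer, str(checkpoint_file))  # assumed method name
    return optimizer
```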

LR Scheduler Checkpoint

The LR scheduler must be boosted by colossalai.booster.Booster before its checkpoint is saved or loaded. In both cases, checkpoint is the local path to the checkpoint file.
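For the LR scheduler, a common pattern is to resume only when a checkpoint file is already present. The sketch below assumes the booster methods are named save_lr_scheduler and load_lr_scheduler and take the boosted scheduler followed by the local file path; the helper names are hypothetical.

```python
import os


def save_lr_scheduler_state(booster, lr_scheduler, checkpoint_file):
    """Write the boosted LR scheduler's state to a local file."""
    booster.save_lr_scheduler(lr_scheduler, checkpoint_file)  # assumed method name


def resume_lr_scheduler(booster, lr_scheduler, checkpoint_file):
    """Load scheduler state if the file exists; return True when it was loaded."""
    if not os.path.isfile(checkpoint_file):
        # Nothing to resume from, so the scheduler keeps its initial state.
        return False
    booster.load_lr_scheduler(lr_scheduler, checkpoint_file)  # assumed method name
    return True
```

Returning a flag lets a training script tell a fresh run from a resumed one without a separate existence check.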

Checkpoint Design

For more details on the checkpoint design, see our discussion "A Unified Checkpoint System Design".
