ColossalAI/examples/language
Latest commit: e614aa34f3 by Wenhao Chen (2024-04-01 11:34:58 +08:00)

[shardformer, pipeline] add `gradient_checkpointing_ratio` and heterogenous shard policy for llama (#5508)

* feat: add `GradientCheckpointConfig` and `PipelineGradientCheckpointConfig`
* feat: apply `GradientCheckpointConfig` to policy and llama_forward
* feat: move `distribute_layer` and `get_stage_index` to PipelineStageManager
* fix: add optional args for `distribute_layer` and `get_stage_index`
* fix: fix changed API calls
* test: update llama tests
* style: polish `GradientCheckpointConfig`
* fix: fix pipeline utils tests
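The commit above moves layer distribution and stage indexing into the pipeline stage manager and adds a checkpointing ratio. As a rough illustration of those ideas only — the class and function names below mirror the commit message but are a simplified sketch, not the actual ColossalAI implementation — layers can be split across pipeline stages and a fraction of each stage's layers marked for activation checkpointing:

```python
from dataclasses import dataclass


@dataclass
class PipelineGradientCheckpointConfig:
    """Toy stand-in for a gradient-checkpoint config (fields assumed)."""
    gradient_checkpointing_ratio: float = 0.0  # fraction of each stage's layers to checkpoint


def distribute_layers(num_layers: int, num_stages: int) -> list[int]:
    """Split layers evenly across stages; earlier stages absorb the remainder."""
    base, rem = divmod(num_layers, num_stages)
    return [base + (1 if s < rem else 0) for s in range(num_stages)]


def get_stage_index(layers_per_stage: list[int], stage: int) -> tuple[int, int]:
    """Return the [start, end) global layer indices owned by `stage`."""
    start = sum(layers_per_stage[:stage])
    return start, start + layers_per_stage[stage]


def checkpointed_layers(cfg: PipelineGradientCheckpointConfig,
                        start: int, end: int) -> list[int]:
    """Mark the first `ratio` fraction of a stage's layers for checkpointing."""
    num_ckpt = int((end - start) * cfg.gradient_checkpointing_ratio)
    return list(range(start, start + num_ckpt))
```

For example, 10 llama layers over 4 stages split as `[3, 3, 2, 2]`, stage 1 owns layers `[3, 6)`, and a ratio of 0.5 checkpoints the first half of that stage's layers. The real policy additionally supports heterogeneous shard policies per the PR title.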
| Path | Latest commit | Date |
| --- | --- | --- |
| bert | [hotfix] set return_outputs=False in examples and polish code (#5404) | 2024-03-25 12:31:09 +08:00 |
| commons | [example] make gpt example directory more clear (#2353) | 2023-01-06 11:11:26 +08:00 |
| gpt | [hotfix] set return_outputs=False in examples and polish code (#5404) | 2024-03-25 12:31:09 +08:00 |
| grok-1 | [Fix] Grok-1 use tokenizer from the same pretrained path (#5532) | 2024-03-28 16:30:04 +08:00 |
| llama2 | [hotfix] set return_outputs=False in examples and polish code (#5404) | 2024-03-25 12:31:09 +08:00 |
| openmoe | [shardformer, pipeline] add `gradient_checkpointing_ratio` and heterogenous shard policy for llama (#5508) | 2024-04-01 11:34:58 +08:00 |
| opt | [hotfix] set return_outputs=False in examples and polish code (#5404) | 2024-03-25 12:31:09 +08:00 |
| palm | [npu] change device to accelerator api (#5239) | 2024-01-09 10:20:05 +08:00 |
| __init__.py | [example] add gpt2 benchmark example script. (#5295) | 2024-03-04 16:18:13 +08:00 |
| data_utils.py | [example] add gpt2 benchmark example script. (#5295) | 2024-03-04 16:18:13 +08:00 |
| model_utils.py | [example] add gpt2 benchmark example script. (#5295) | 2024-03-04 16:18:13 +08:00 |
| performance_evaluator.py | [example] add gpt2 benchmark example script. (#5295) | 2024-03-04 16:18:13 +08:00 |