# Config file
Here is a config file example showing how to train a ViT model on the CIFAR10 dataset using Colossal-AI:
```python
from colossalai.amp import AMP_TYPE

# optional
# the parallel config has three keys: data, pipeline, tensor
# data parallel size is inferred, so only pipeline and tensor are set here
parallel = dict(
    pipeline=dict(size=1),
    tensor=dict(size=4, mode='2d'),
)

# optional
# pipeline or no pipeline schedule
fp16 = dict(
    mode=AMP_TYPE.NAIVE,
    initial_scale=2 ** 8
)

# optional
# configuration for zero
# you can refer to the Zero Redundancy Optimizer and zero offload section for details
# https://www.colossalai.org/zero.html
zero = dict(
    level=<int>,
    ...
)

# optional
# only needed if you are using complex gradient handling
# otherwise, you do not need this in your config file
# default: gradient_handlers = None
gradient_handlers = [dict(type='MyHandler', arg1=1, arg2=2), ...]

# optional
# specify the gradient accumulation size
# useful if your batch size is not large enough
gradient_accumulation = <int>

# optional
# add gradient clipping to your engine
# this config is not compatible with zero and AMP_TYPE.NAIVE,
# but works with AMP_TYPE.TORCH and AMP_TYPE.APEX
# default: clip_grad_norm = 0.0
clip_grad_norm = <float>

# optional
# cudnn settings
# the defaults are shown below
cudnn_benchmark = False
cudnn_deterministic = True
```
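
The config file is an ordinary Python module: every top-level variable defined in it becomes a field of the runtime configuration. Below is a rough usage sketch, not part of the original example; it assumes the placeholders (`<int>`, `<float>`, `...`) have been replaced with concrete values, the file is saved as `./config.py`, and the legacy `colossalai.launch_from_torch` / `global_context` API is available (the launch API has changed across Colossal-AI releases).

```python
# Hypothetical usage sketch; assumes the legacy Colossal-AI launch API and a
# filled-in ./config.py (placeholders such as <int> replaced with real values).
import colossalai
from colossalai.core import global_context as gpc

# parse ./config.py and set up the distributed environment from the environment
# variables injected by torchrun / torch.distributed.launch
colossalai.launch_from_torch(config='./config.py')

# every top-level variable in the config file is exposed on gpc.config
print(gpc.config.parallel)  # {'pipeline': {'size': 1}, 'tensor': {'size': 4, 'mode': '2d'}}
print(gpc.config.fp16)      # the mixed-precision settings defined above
```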