[doc] add tutorial for cluster utils (#3763)

* [doc] add en cluster utils doc

* [doc] add zh cluster utils doc

* [doc] add cluster utils doc in sidebar
Hongxin Liu 2023-05-19 12:12:20 +08:00 committed by GitHub
parent 5452df63c5
commit 5ce6c9d86f
3 changed files with 66 additions and 1 deletions


@ -58,7 +58,8 @@
       ]
     },
     "features/pipeline_parallel",
-    "features/nvme_offload"
+    "features/nvme_offload",
+    "features/cluster_utils"
   ]
 },
 {


@ -0,0 +1,32 @@
# Cluster Utilities
Author: [Hongxin Liu](https://github.com/ver217)
**Prerequisite:**
- [Distributed Training](../concepts/distributed_training.md)
## Introduction
We provide a utility class, `colossalai.cluster.DistCoordinator`, to coordinate distributed training. It exposes information about the cluster, such as the number of nodes and the number of processes per node.
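
To make the rank-related queries concrete, here is a minimal sketch that uses only the standard library. `RankInfo` is a hypothetical stand-in and not the `DistCoordinator` implementation. It assumes the `RANK`, `LOCAL_RANK` and `WORLD_SIZE` environment variables that launchers such as `torchrun` set, and it assumes that "master" means global rank 0, "node master" means local rank 0, and "last process" means the highest global rank.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RankInfo:
    """Hypothetical stand-in for illustration; not the DistCoordinator class."""

    rank: int         # global rank across all nodes
    local_rank: int   # rank within the current node
    world_size: int   # total number of processes

    @classmethod
    def from_env(cls, env=None):
        # Launchers such as torchrun export these variables for every process.
        env = os.environ if env is None else env
        return cls(
            rank=int(env.get("RANK", 0)),
            local_rank=int(env.get("LOCAL_RANK", 0)),
            world_size=int(env.get("WORLD_SIZE", 1)),
        )

    def is_master(self) -> bool:
        # Assumption: the master is the process with global rank 0.
        return self.rank == 0

    def is_node_master(self) -> bool:
        # Assumption: the node master is the process with local rank 0.
        return self.local_rank == 0

    def is_last_process(self) -> bool:
        return self.rank == self.world_size - 1


# Simulate global rank 4: the first process on the second of two 4-process nodes.
info = RankInfo.from_env({"RANK": "4", "LOCAL_RANK": "0", "WORLD_SIZE": "8"})
print(info.is_master(), info.is_node_master(), info.is_last_process())
```

In this simulated layout the process is a node master but neither the global master nor the last process, which is the distinction the corresponding `DistCoordinator` methods are meant to capture.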
## API Reference
{{ autodoc:colossalai.cluster.DistCoordinator }}
{{ autodoc:colossalai.cluster.DistCoordinator.is_master }}
{{ autodoc:colossalai.cluster.DistCoordinator.is_node_master }}
{{ autodoc:colossalai.cluster.DistCoordinator.is_last_process }}
{{ autodoc:colossalai.cluster.DistCoordinator.print_on_master }}
{{ autodoc:colossalai.cluster.DistCoordinator.print_on_node_master }}
{{ autodoc:colossalai.cluster.DistCoordinator.priority_execution }}
{{ autodoc:colossalai.cluster.DistCoordinator.destroy }}
{{ autodoc:colossalai.cluster.DistCoordinator.block_all }}
{{ autodoc:colossalai.cluster.DistCoordinator.on_master_only }}
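
The idea behind a priority-execution context manager can be sketched without a real cluster. The example below is an illustration under stated assumptions, not the library's code: threads stand in for processes, `threading.Barrier` stands in for a collective barrier, and the names `priority_execution`, `worker` and `WORLD_SIZE` are defined locally for the demonstration. The chosen executor rank runs the body first; every other rank waits at the barrier until the executor has finished.

```python
import threading
from contextlib import contextmanager

WORLD_SIZE = 4                            # hypothetical number of "processes"
barrier = threading.Barrier(WORLD_SIZE)   # stands in for a collective barrier
order = []                                # records which rank ran the body when
order_lock = threading.Lock()


@contextmanager
def priority_execution(rank, executor_rank=0):
    # Non-executor ranks block before the body ...
    if rank != executor_rank:
        barrier.wait()
    yield
    # ... and the executor releases them once its body is done.
    if rank == executor_rank:
        barrier.wait()


def worker(rank):
    with priority_execution(rank):
        # Typical use: let one rank download or preprocess data first,
        # so the other ranks can reuse the cached result.
        with order_lock:
            order.append(rank)


threads = [threading.Thread(target=worker, args=(r,)) for r in range(WORLD_SIZE)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("first rank to run the body:", order[0])
```

The executor rank is always first in `order`; the remaining ranks run in whatever order the scheduler picks once the barrier releases them.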


@ -0,0 +1,32 @@
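# Cluster Utilities
Author: [Hongxin Liu](https://github.com/ver217)
**Prerequisite:**
- [Distributed Training](../concepts/distributed_training.md)
## Introduction
We provide a utility class, `colossalai.cluster.DistCoordinator`, to coordinate distributed training. It is useful for obtaining information about the cluster, such as the number of nodes and the number of processes per node.
## API Reference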
{{ autodoc:colossalai.cluster.DistCoordinator }}
{{ autodoc:colossalai.cluster.DistCoordinator.is_master }}
{{ autodoc:colossalai.cluster.DistCoordinator.is_node_master }}
{{ autodoc:colossalai.cluster.DistCoordinator.is_last_process }}
{{ autodoc:colossalai.cluster.DistCoordinator.print_on_master }}
{{ autodoc:colossalai.cluster.DistCoordinator.print_on_node_master }}
{{ autodoc:colossalai.cluster.DistCoordinator.priority_execution }}
{{ autodoc:colossalai.cluster.DistCoordinator.destroy }}
{{ autodoc:colossalai.cluster.DistCoordinator.block_all }}
{{ autodoc:colossalai.cluster.DistCoordinator.on_master_only }}