diff --git a/doc/en/train_performance.md b/doc/en/train_performance.md
index 9c77d9e..b24b7e8 100644
--- a/doc/en/train_performance.md
+++ b/doc/en/train_performance.md
@@ -91,3 +91,15 @@ When `Activation Ckpt` is turned off, the test results are as shown in the table
+### Extreme GPU Number Testing
+#### Minimum number of 80GB GPUs required for training
+| Model | GPU num | zero | tp | pp | GPU memory (GB) |
+|-|-|-|-|-|-|
+| 7B | 3 | -1 | 1 | 3 | 75 |
+| 20B | 8 | -1 | 8 | 1 | 64 |
+#### Minimum number of 40GB GPUs required for training
+| Model | GPU num | zero | tp | pp | GPU memory (GB) |
+|-|-|-|-|-|-|
+| 7B | 6 | -1 | 2 | 1 | 40 |
+| 20B | 16 | -1 | 8 | 1 | 38 |
+
diff --git a/doc/train_performance.md b/doc/train_performance.md
index 239e20f..5b03e3d 100644
--- a/doc/train_performance.md
+++ b/doc/train_performance.md
@@ -88,3 +88,14 @@ In InternLM, the `zero1` setting determines the range over which optimizer states are partitioned.
+### Extreme GPU Number Testing
+#### Minimum number of 80GB GPUs required for training
+| Model | GPU num | zero | tp | pp | GPU memory (GB) |
+|-|-|-|-|-|-|
+| 7B | 3 | -1 | 1 | 3 | 75 |
+| 20B | 8 | -1 | 8 | 1 | 64 |
+#### Minimum number of 40GB GPUs required for training
+| Model | GPU num | zero | tp | pp | GPU memory (GB) |
+|-|-|-|-|-|-|
+| 7B | 6 | -1 | 2 | 1 | 40 |
+| 20B | 16 | -1 | 8 | 1 | 38 |
\ No newline at end of file
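
The tables added in this diff give `GPU num`, `tp`, and `pp` but not the data-parallel size. The sketch below derives it for each row, assuming the usual 3D-parallel layout in which the total GPU count equals `dp * tp * pp`. The diff itself does not state that relation, and the helper name `data_parallel_size` and the row labels are hypothetical, chosen only for illustration.

```python
# Rows copied from the tables in the diff: (label, gpu_num, tp, pp).
# The labels are illustrative; the numbers are taken verbatim from the tables.
ROWS = [
    ("7B / 80GB", 3, 1, 3),
    ("20B / 80GB", 8, 8, 1),
    ("7B / 40GB", 6, 2, 1),
    ("20B / 40GB", 16, 8, 1),
]


def data_parallel_size(gpu_num: int, tp: int, pp: int) -> int:
    """Return gpu_num // (tp * pp).

    Assumption (not stated in the diff): world size = dp * tp * pp.
    """
    model_parallel = tp * pp
    if gpu_num % model_parallel != 0:
        raise ValueError(
            f"{gpu_num} GPUs cannot be split evenly into tp={tp} x pp={pp} groups"
        )
    return gpu_num // model_parallel


# Data-parallel size implied by each table row under the assumption above.
DP_SIZES = {label: data_parallel_size(n, tp, pp) for label, n, tp, pp in ROWS}

for label, dp in DP_SIZES.items():
    print(f"{label}: dp = {dp}")
```

Under that assumption, both 80GB rows and the other two rows divide evenly, so every listed GPU count is consistent with its `tp` and `pp` values; a row that did not divide evenly would raise `ValueError`.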