diff --git a/doc/en/train_performance.md b/doc/en/train_performance.md
index 9c77d9e..b24b7e8 100644
--- a/doc/en/train_performance.md
+++ b/doc/en/train_performance.md
@@ -91,3 +91,15 @@ When `Activation Ckpt` is turned off, the test results are as shown in the table
     <img src="../imgs/flops.png" width="580"/>
 </div>
 
+### Minimum GPU Count Testing
+#### Minimum number of 80GB GPUs required for training
+| Model | GPU count | zero | tp | pp | GPU memory (GB) |
+|-|-|-|-|-|-|
+| 7B | 3 | -1 | 1 | 3 | 75 |
+| 20B | 8 | -1 | 8 | 1 | 64 |
+#### Minimum number of 40GB GPUs required for training
+| Model | GPU count | zero | tp | pp | GPU memory (GB) |
+|-|-|-|-|-|-|
+| 7B | 6 | -1 | 2 | 1 | 40 |
+| 20B | 16 | -1 | 8 | 1 | 38 |
+
diff --git a/doc/train_performance.md b/doc/train_performance.md
index 239e20f..5b03e3d 100644
--- a/doc/train_performance.md
+++ b/doc/train_performance.md
@@ -88,3 +88,14 @@ InternLM中`zero1`的配置决定了优化器状态的分配范围。
     <img src="../doc/imgs/flops.png" width="580"/>
 </div>
 
+### Minimum GPU Count Testing
+#### Minimum number of 80GB GPUs required for training
+| Model | GPU count | zero | tp | pp | GPU memory (GB) |
+|-|-|-|-|-|-|
+| 7B | 3 | -1 | 1 | 3 | 75 |
+| 20B | 8 | -1 | 8 | 1 | 64 |
+#### Minimum number of 40GB GPUs required for training
+| Model | GPU count | zero | tp | pp | GPU memory (GB) |
+|-|-|-|-|-|-|
+| 7B | 6 | -1 | 2 | 1 | 40 |
+| 20B | 16 | -1 | 8 | 1 | 38 |
\ No newline at end of file