mirror of https://github.com/InternLM/InternLM
### Installing Xtuner

```shell
git clone -b v0.2.0rc0 https://github.com/InternLM/xtuner.git
cd xtuner
```

Replace the contents of `requirements/runtime.txt` with the following dependencies:

```text
pyyaml
datasets>=3.2.0
einops
loguru
mmengine==0.10.6
openpyxl
peft>=0.14.0
scikit-image
scipy
SentencePiece
tiktoken
torch==2.6
torchvision==0.21.0
transformers==4.48.0
transformers_stream_generator
decorator
numpy==1.26.4
```

Replace the contents of `requirements/deepspeed.txt` with the following dependencies:

```text
deepspeed==0.16.2
mpi4py
```

Use the following command for installation:

```shell
pip install -e '.[all]'
```

**Note**:

- Please note that the versions of `torch`, `torch_npu`, and `torchvision` must match.
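A quick way to sanity-check this before training is to compare the installed versions programmatically. A hedged sketch using only the standard library — it assumes `torch_npu` releases track `torch`'s major.minor version, which is a heuristic, so consult the Ascend/`torch_npu` release notes for the authoritative mapping:

```python
# Heuristic pre-flight check (stdlib only): do two installed packages agree
# on their major.minor version? Treat a mismatch as a warning, not proof.
from importlib.metadata import version, PackageNotFoundError

def major_minor(ver):
    # "2.6.0+cpu" -> ("2", "6"): drop local version tags, keep two fields.
    return tuple(ver.split("+")[0].split(".")[:2])

def versions_match(pkg_a, pkg_b):
    try:
        return major_minor(version(pkg_a)) == major_minor(version(pkg_b))
    except PackageNotFoundError:
        # At least one package is not installed at all.
        return False

# Example usage in the target environment:
# versions_match("torch", "torch_npu")
```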

### LoRA Fine-tuning

Copy the reference configuration and rename it:

```shell
xtuner copy-cfg internlm2_5_chat_7b_qlora_oasst1_e3 .
mv internlm2_5_chat_7b_qlora_oasst1_e3_copy.py internlm3_8b_instruct_lora_oasst1_e10.py
```

Download the model weights `internlm3-8b-instruct` and the `oasst1` dataset yourself. The modifications to the configuration file `internlm3_8b_instruct_lora_oasst1_e10.py` are as follows:

```python
pretrained_model_name_or_path = 'path/to/internlm3-8b-instruct'
data_path = 'path/to/oasst1'
max_epochs = 10
```
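Since the configuration is a plain Python file, the three assignments above can also be patched in place by a small script rather than edited by hand. A minimal sketch using only the standard library (the helper name is illustrative, not part of Xtuner):

```python
# Hypothetical helper: rewrite top-level assignments such as `max_epochs = 10`
# in the copied config text via a regex over "name = value" lines.
import re

def override_assignments(cfg_text, **overrides):
    for name, value in overrides.items():
        cfg_text = re.sub(
            rf"^{name}\s*=.*$",     # match the whole assignment line
            f"{name} = {value!r}",  # repr() keeps string values quoted
            cfg_text,
            flags=re.M,
        )
    return cfg_text

sample = "pretrained_model_name_or_path = 'internlm/internlm3-8b-instruct'\nmax_epochs = 3\n"
patched = override_assignments(sample, max_epochs=10)
```

This only handles simple one-line assignments; for anything nested, edit the file directly.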

Run the following command to start single-machine eight-card fine-tuning:

```shell
NPROC_PER_NODE=8 xtuner train internlm3_8b_instruct_lora_oasst1_e10.py --deepspeed deepspeed_zero2
```

The fine-tuning results are saved in the directory `./work_dirs/internlm3_8b_instruct_lora_oasst1_e10`.
The comparison of loss between NPU and GPU is as follows:


|

|
||||||
|
|
|
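
Checkpoints in the work directory are written as `iter_<N>.pth`. Picking the most recent one (for evaluation or conversion) needs a numeric sort, since lexically `iter_1000` sorts before `iter_500`. A small sketch, assuming that naming scheme:

```python
# Sketch: select the checkpoint with the highest iteration number from the
# work directory; returns None if no iter_*.pth file is present.
from pathlib import Path

def latest_checkpoint(work_dir):
    ckpts = Path(work_dir).glob("iter_*.pth")
    # p.stem is "iter_<N>"; sort by the integer N, not the string.
    return max(ckpts, key=lambda p: int(p.stem.split("_")[1]), default=None)

# Example (path matches the training run above):
# latest_checkpoint("./work_dirs/internlm3_8b_instruct_lora_oasst1_e10")
```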