add openmind readme

pull/816/head
baymax591 2025-01-17 14:43:35 +08:00
parent d679966d4d
commit 2958e62164
5 changed files with 134 additions and 2 deletions


@@ -37,7 +37,7 @@
This is a guide to training and running inference with the InternLM series models on Ascend NPUs.
## News
-\[2025.01.15\] InternLM3-8B-Instruct can be used in Xtuner, LLaMA-Factory and transformers.
+\[2025.01.15\] InternLM3-8B-Instruct can be used in Xtuner, LLaMA-Factory, transformers and openMind.
## Model Zoo
@@ -303,6 +303,69 @@ Execute the inference script:
python inference_internlm3_instruct_8b.py
```
## openMind Library
### Introduction to openMind
The openMind Library is an open-source suite for large models that natively supports fine-tuning, inference, evaluation, and deployment on Ascend NPUs. It offers easy-to-use interfaces that fully exploit the performance of Ascend NPUs, enabling rapid support for and enhancement of cutting-edge industry models.
### Fine-Tuning
The openMind Library provides a one-click model fine-tuning solution on Ascend NPUs, covering data processing, weight loading from multiple model hubs, low-rank adaptation (LoRA), and quantized fine-tuning (QLoRA). It also supports Ascend NPU fused-operator optimization to improve model training performance.
#### Installing the openMind Library
```shell
git clone -b dev https://gitee.com/ascend/openmind.git
cd openmind
pip install -e .[pt]
```
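Before launching a run, it can be worth confirming that PyTorch can see the Ascend NPU. The check below is a generic sanity check, not part of the openMind documentation; it assumes the Ascend PyTorch adapter `torch_npu` (which the PyTorch backend of openMind builds on) is installed and registers the `torch.npu` backend.

```python
# Sanity check (assumption: torch_npu is installed alongside openmind[pt]).
# Importing torch_npu registers the Ascend backend under torch.npu.
import torch
import torch_npu  # noqa: F401  # side effect: enables torch.npu

print(torch.npu.is_available())   # True if an Ascend NPU is visible
print(torch.npu.device_count())   # number of visible NPUs
```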
#### Initiating Fine-Tuning
Within the cloned openmind directory, fine-tuning can be initiated with the following command:
```shell
openmind-cli train examples/internlm3/train_sft_full_internlm3.yaml
```
#### Training Results and Advantages
As illustrated in the figure below, the training loss of the openMind Library converges normally; compared with training on GPU, the average relative error is within 2%.
<div align=center>
<img src="./assets/openmind_train_loss_compare.png" width="600px">
</div>
<p align="center"><strong>Accuracy Comparison</strong> (npu=8, per_device_train_batch_size=6, max_length=1024)</p>
The openMind Library supports fine-tuning methods such as LoRA and QLoRA on Ascend NPUs, significantly reducing device memory usage. As illustrated in the figure below, QLoRA fine-tuning reduces device memory consumption by roughly 40%.
<div align=center>
<img src="./assets/openmind_train_memory.png" width="400px">
</div>
<p align="center"><strong>Memory Consumption</strong> (npu=8, per_device_train_batch_size=6, max_length=1024)</p>
The openMind Library automatically loads Ascend NPU fused operators during training, with no manual code or configuration changes required, improving training performance while preserving ease of use. The figure below shows the performance gain obtained when the openMind Library enables Ascend NPU fused operators by default.
<div align=center>
<img src="./assets/openmind_fused_ops.png" width="300px">
</div>
<p align="center"><strong>Training Samples per Second</strong></p>
For more features, please refer to the [openMind Fine-tuning Documentation](https://modelers.cn/docs/en/openmind-library/1.0.0/basic_tutorial/finetune/finetune_pt.html).
### Inference
In addition to fine-tuning, the openMind Library can also be used for model inference. After installing the openMind Library, a single round of inference can be run with the following command:
```shell
openmind-cli run Intern/internlm3-8b-instruct --task text-generation --input '{"text_inputs":"What is AI?","max_length":512}' --trust_remote_code 1
```
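The same task can also be driven from Python via the pipeline interface covered in the inference documentation linked below. The snippet is a minimal sketch, assuming `openmind.pipeline` accepts the transformers-style arguments shown here; adjust the model id and generation parameters as needed.

```python
# Minimal sketch of programmatic inference (assumption: openmind.pipeline
# follows the transformers-style interface described in the openMind docs).
from openmind import pipeline

generator = pipeline(
    "text-generation",
    model="Intern/internlm3-8b-instruct",
    trust_remote_code=True,
)
print(generator("What is AI?", max_length=512))
```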
For more features, please refer to the [openMind Inference Documentation](https://modelers.cn/docs/en/openmind-library/1.0.0/basic_tutorial/pipeline.html).
## License
The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage. To apply for a commercial license, please fill in the [application form (English)](https://wj.qq.com/s2/12727483/5dba/)/[申请表(中文)](https://wj.qq.com/s2/12725412/f7c1/). For other questions or collaborations, please contact <internlm@pjlab.org.cn>.


@@ -37,7 +37,7 @@
This is a guide to training and running inference with the InternLM series models on Ascend NPUs.
## News
-\[2025.01.15\] InternLM3-8B-Instruct can be used in Xtuner, LLaMA-Factory and transformers.
+\[2025.01.15\] InternLM3-8B-Instruct can be used in Xtuner, LLaMA-Factory, transformers and openMind.
## Model Zoo
@@ -300,6 +300,75 @@ print(response)
python inference_internlm3_instruct_8b.py
```
## openMind Library
### Introduction to openMind
The openMind Library is an open-source suite for large models that natively supports fine-tuning, inference, evaluation, and deployment on Ascend NPUs. It offers easy-to-use interfaces that fully exploit the performance of Ascend NPUs, enabling rapid support for and enhancement of cutting-edge industry models.
### Fine-Tuning
The openMind Library provides a one-click model fine-tuning solution on Ascend NPUs, covering data processing, weight loading from multiple model hubs, low-rank adaptation (LoRA), and quantized fine-tuning (QLoRA). It also supports Ascend NPU fused-operator optimization to improve model training performance.
#### Installing the openMind Library
```shell
git clone -b dev https://gitee.com/ascend/openmind.git
cd openmind
pip install -e .[pt]
```
#### Initiating Fine-Tuning
Within the cloned openmind directory, fine-tuning can be initiated with the following command:
```shell
openmind-cli train examples/internlm3/train_sft_full_internlm3.yaml
```
#### Training Results and Advantages
As illustrated in the figure below, the training loss of the openMind Library converges normally; compared with training on GPU, the average relative error is within 2%.
<div align=center>
<img src="./assets/openmind_train_loss_compare.png" width="600px">
</div>
<p align="center"><strong>精度对比</strong> (npu=8, per_device_train_batch_size=6, max_length=1024)</p>
The openMind Library supports fine-tuning methods such as LoRA and QLoRA on Ascend NPUs, significantly reducing device memory usage. As illustrated in the figure below, QLoRA fine-tuning reduces device memory consumption by roughly 40%.
<div align=center>
<img src="./assets/openmind_train_memory.png" width="400px">
</div>
<p align="center"><strong>Full/LoRA/QLoRA 显存开销</strong> (npu=8, per_device_train_batch_size=6, max_length=1024)</p>
The openMind Library automatically loads Ascend NPU fused operators during training, with no manual code or configuration changes required, improving training performance while preserving ease of use. The figure below shows the performance gain obtained when openMind enables Ascend NPU fused operators by default.
<div align=center>
<img src="./assets/openmind_fused_ops.png" width="300px">
</div>
<p align="center"><strong>每秒训练样本数</strong></p>
For more features, please refer to the [openMind Fine-tuning Documentation](https://modelers.cn/docs/zh/openmind-library/1.0.0/basic_tutorial/finetune/finetune_pt.html).
### Inference
In addition to fine-tuning, the openMind Library can also be used for model inference. After installing the openMind Library, a single round of inference can be run with the following command:
```shell
openmind-cli run Intern/internlm3-8b-instruct --task text-generation --input '{"text_inputs":"What is AI?","max_length":512}' --trust_remote_code 1
```
For more features, please refer to the [openMind Inference Documentation](https://modelers.cn/docs/zh/openmind-library/1.0.0/basic_tutorial/pipeline.html).
## License
The code in this repository is licensed under Apache-2.0. Model weights are fully open for academic research, and free commercial use is available upon application ([application form](https://wj.qq.com/s2/12725412/f7c1/)). For other questions or collaborations, please contact <internlm@pjlab.org.cn>.

Binary file not shown (new image, 8.9 KiB).

Binary file not shown (new image, 212 KiB).

Binary file not shown (new image, 6.9 KiB).