mirror of https://github.com/InternLM/InternLM
commit 2958e62164 (parent d679966d4d): add openmind readme
@@ -37,7 +37,7 @@
This is a guide to training and running inference with the InternLM series models on Ascend NPUs.
## News

-\[2025.01.15\] InternLM3-8B-Instruct can be used in Xtuner, LLaMA-Factory and transformers.
+\[2025.01.15\] InternLM3-8B-Instruct can be used in Xtuner, LLaMA-Factory, transformers and openMind.
## Model Zoo
@@ -303,6 +303,69 @@ Execute the inference script:
python inference_internlm3_instruct_8b.py
```
## openMind Library
### Introduction to openMind
The openMind Library is an open-source suite for large models, with native support for fine-tuning, inference, evaluation, and deployment on Ascend NPUs. It offers simple, user-friendly interfaces that make full use of Ascend NPU performance, enabling rapid support for and enhancement of cutting-edge industry models.
### Fine-Tuning
The openMind Library provides a one-click model fine-tuning solution on Ascend NPUs, covering data processing, weight loading from multiple model hubs, low-rank adaptation (LoRA), and quantized fine-tuning (QLoRA). It also supports Ascend NPU fused-operator optimization to improve model training performance.
#### Installing the openMind Library
```shell
git clone -b dev https://gitee.com/ascend/openmind.git
cd openmind
pip install -e .[pt]
```
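
A quick sanity check that the editable install is visible to Python can look like the sketch below; it assumes the package follows the common convention of exposing a `__version__` attribute:

```python
# Minimal sketch: confirm the openMind Library import works after installation.
# Assumption: openmind exposes the conventional __version__ attribute.
import openmind

print(openmind.__version__)
```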
#### Initiating Fine-Tuning
From the openmind directory, fine-tuning can be launched with the following command:
```shell
openmind-cli train examples/internlm3/train_sft_full_internlm3.yaml
```
#### Training Results and Advantages
As the figure below shows, training loss with the openMind Library converges normally, and the average relative error compared with a GPU baseline is within 2%.
<div align=center>
<img src="./assets/openmind_train_loss_compare.png" width="600px">
</div>
<p align="center"><strong>Accuracy Comparison</strong> (npu=8, per_device_train_batch_size=6, max_length=1024)</p>
The openMind Library supports fine-tuning methods such as LoRA and QLoRA on Ascend NPUs, significantly reducing device memory usage. As the figure below shows, QLoRA fine-tuning reduces device memory consumption by roughly 40%.
<div align=center>
<img src="./assets/openmind_train_memory.png" width="400px">
</div>
<p align="center"><strong>Memory Consumption</strong> (npu=8, per_device_train_batch_size=6, max_length=1024)</p>
The openMind Library automatically loads Ascend NPU fused operators during training, with no need for developers to modify code or configuration by hand, improving training performance while preserving ease of use. The figure below shows the performance gained when the openMind Library enables Ascend NPU fused operators by default.
<div align=center>
<img src="./assets/openmind_fused_ops.png" width="300px">
</div>
<p align="center"><strong>Training Samples per Second</strong></p>
For more features, please refer to the [openMind Fine-tuning Documentation](https://modelers.cn/docs/en/openmind-library/1.0.0/basic_tutorial/finetune/finetune_pt.html).
### Inference
Beyond fine-tuning, the openMind Library can also be used for model inference. Once it is installed, a single round of inference can be run with the following command:
```shell
openmind-cli run Intern/internlm3-8b-instruct --task text-generation --input '{"text_inputs":"What is AI?","max_length":512}' --trust_remote_code 1
```
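
The same single-round inference can also be driven from Python. The sketch below assumes the openMind Library exposes a `pipeline` API mirroring the transformers interface, as the inference documentation linked below describes; the `npu:0` device string assumes `torch_npu` is installed:

```python
# Minimal sketch: single-round text generation from Python.
# Assumptions: openmind.pipeline mirrors the transformers pipeline API,
# and "npu:0" is a valid device string once torch_npu is installed.
from openmind import pipeline

generator = pipeline(
    "text-generation",
    model="Intern/internlm3-8b-instruct",
    trust_remote_code=True,
    device="npu:0",
)

print(generator("What is AI?", max_length=512))
```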
For more features, please refer to the [openMind Inference Documentation](https://modelers.cn/docs/en/openmind-library/1.0.0/basic_tutorial/pipeline.html).
## License
The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage. To apply for a commercial license, please fill in the [application form (English)](https://wj.qq.com/s2/12727483/5dba/) / [application form (Chinese)](https://wj.qq.com/s2/12725412/f7c1/). For other questions or collaborations, please contact <internlm@pjlab.org.cn>.
@@ -37,7 +37,7 @@
This is a guide to training and running inference with the InternLM series models on Ascend NPUs.
## News

-\[2025.01.15\] InternLM3-8B-Instruct can be used in Xtuner, LLaMA-Factory and transformers.
+\[2025.01.15\] InternLM3-8B-Instruct can be used in Xtuner, LLaMA-Factory, transformers and openMind.
## Model Zoo
@@ -300,6 +300,75 @@ print(response)
python inference_internlm3_instruct_8b.py
```
## openMind Library
### Introduction to openMind
The openMind Library is an open-source suite for large models, with native support for fine-tuning, inference, evaluation, and deployment on Ascend NPUs. It offers simple, user-friendly interfaces that make full use of Ascend NPU performance, enabling rapid support for and enhancement of cutting-edge industry models.
### Fine-Tuning
The openMind Library provides a one-click model fine-tuning solution on Ascend NPUs, covering data processing, weight loading from multiple model hubs, low-rank adaptation (LoRA), and quantized fine-tuning (QLoRA). It also supports Ascend NPU fused-operator optimization to improve model training performance.
#### Installing the openMind Library
```shell
git clone -b dev https://gitee.com/ascend/openmind.git
cd openmind
pip install -e .[pt]
```
#### Initiating Fine-Tuning
From the openmind directory, fine-tuning can be launched with the following command:
```shell
openmind-cli train examples/internlm3/train_sft_full_internlm3.yaml
```
#### Training Results and Advantages
As the figure below shows, training loss with the openMind Library converges normally, and the average relative error compared with a GPU baseline is within 2%.
<div align=center>
<img src="./assets/openmind_train_loss_compare.png" width="600px">
</div>
<p align="center"><strong>精度对比</strong> (npu=8, per_device_train_batch_size=6, max_length=1024)</p>
The openMind Library supports fine-tuning methods such as LoRA and QLoRA on Ascend NPUs, significantly reducing device memory usage. As the figure below shows, QLoRA fine-tuning reduces device memory consumption by roughly 40%.
<div align=center>
<img src="./assets/openmind_train_memory.png" width="400px">
</div>
<p align="center"><strong>Full/LoRA/QLoRA 显存开销</strong> (npu=8, per_device_train_batch_size=6, max_length=1024)</p>
The openMind Library automatically loads Ascend NPU fused operators during training, with no need for developers to modify code or configuration by hand, improving training performance while preserving ease of use. The figure below shows the performance gained when openMind enables Ascend NPU fused operators by default.
<div align=center>
<img src="./assets/openmind_fused_ops.png" width="300px">
</div>
<p align="center"><strong>每秒训练样本数</strong></p>
For more features, please refer to the [openMind Fine-tuning Documentation](https://modelers.cn/docs/zh/openmind-library/1.0.0/basic_tutorial/finetune/finetune_pt.html).
### Inference
Beyond fine-tuning, the openMind Library can also be used for model inference. Once it is installed, a single round of inference can be run with the following command:
```shell
openmind-cli run Intern/internlm3-8b-instruct --task text-generation --input '{"text_inputs":"What is AI?","max_length":512}' --trust_remote_code 1
```
For more features, please refer to the [openMind Inference Documentation](https://modelers.cn/docs/zh/openmind-library/1.0.0/basic_tutorial/pipeline.html).
## License
The code in this repository is open-sourced under the Apache-2.0 license. Model weights are fully open for academic research, and a free commercial-use license can also be applied for ([application form](https://wj.qq.com/s2/12725412/f7c1/)). For other questions or collaborations, please contact <internlm@pjlab.org.cn>.
Binary files added (not shown): three images, 8.9 KiB, 212 KiB, and 6.9 KiB.