Fix readme about conversion to transformers (#25)

* add links for 8k

* fix acknowledgement

* modified readme for convert_hf
yhcc 2023-07-07 13:38:06 +08:00 committed by GitHub
parent ed04c7edb0
commit 745d2b911a
5 changed files with 38 additions and 10 deletions


@@ -3,8 +3,8 @@
├── transformers # tools for adapting Hugging Face's transformers
│ ├── configuration_internlm.py # tools for adapting config
│ ├── modeling_internlm.py # tools for adapting model
│ └── tokenization_internlm.py # tools for adapting tokenizer
├── convert2hf.py # tools for adapting models to Hugging Face's format
│ ├── tokenization_internlm.py # tools for adapting tokenizer
│ └── convert2hf.py # tools for adapting models to Hugging Face's format
└── tokenizer.py # tools for generating `bin` and `meta` files for raw data
```


@@ -4,7 +4,7 @@ This directory provides some tools for model training with the following file structure.
│ ├── configuration_internlm.py # tools for adapting config
│ ├── modeling_internlm.py # tools for adapting model
│ └── tokenization_internlm.py # tools for adapting tokenizer
├── convert2hf.py # tools for adapting models to Hugging Face's format
│ └── convert2hf.py # tools for adapting models to Hugging Face's format
└── tokenizer.py # tools for generating `bin` and `meta` files for raw data
```


@@ -0,0 +1,26 @@
# InternLM Transformers
[English](./README.md) |
[简体中文](./README-zh-Hans.md)
This folder contains the `InternLM` model in transformers format.
## Weight Conversion
`convert2hf.py` can convert saved training weights into the transformers format with a single command.
```bash
python convert2hf.py --src_folder origin_ckpt/ --tgt_folder hf_ckpt/ --tokenizer ../v7_sft.model
```
Then, you can load it using the `from_pretrained` interface:
```python
from modeling_internlm import InternLMForCausalLM
model = InternLMForCausalLM.from_pretrained("hf_ckpt/")
```
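As a rough illustration (a sketch under assumptions, not code from this repository), loading and prompting the converted checkpoint might look like the following; the `InternLMTokenizer` class name, the `torch_dtype` choice, and the prompt are assumptions that may differ from the actual adapter code in this folder.
```python
# Hypothetical usage sketch: load the converted checkpoint and generate a
# short completion. Class names below are assumptions, not confirmed APIs.
import torch

from modeling_internlm import InternLMForCausalLM
from tokenization_internlm import InternLMTokenizer  # assumed class name

tokenizer = InternLMTokenizer.from_pretrained("hf_ckpt/")
model = InternLMForCausalLM.from_pretrained("hf_ckpt/", torch_dtype=torch.float16)
model.eval()

inputs = tokenizer("Hello, InternLM!", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```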
`intern_moss_example.py` demonstrates how to use LoRA for fine-tuning on the `fnlp/moss-moon-002-sft` dataset.


@@ -1,16 +1,19 @@
# InternLM Transformers
This folder contains the `InternLM` model in transformers format.
[English](./README.md) |
[简体中文](./README-zh-Hans.md)
## Weight Conversion
This folder contains the `InternLM` model in transformers format.
`../tools/convert2hf.py` can convert saved training weights into the transformers format with a single command.
## Weight Conversion
`convert2hf.py` can convert saved training weights into the transformers format with a single command.
```bash
python convert2hf.py --src_folder origin_ckpt/ --tgt_folder hf_ckpt/ --tokenizer tokenizes/tokenizer.model
python convert2hf.py --src_folder origin_ckpt/ --tgt_folder hf_ckpt/ --tokenizer ../v7_sft.model
```
Then you can load it using the `from_pretrained` interface:
Then, you can load it using the `from_pretrained` interface:
```python
from modeling_internlm import InternLMForCausalLM
@@ -18,5 +21,4 @@ from modeling_internlm import InternLMForCausalLM
model = InternLMForCausalLM.from_pretrained("hf_ckpt/")
```
`moss_example.py` demonstrates how to use LoRA for fine-tuning on the `fnlp/moss-moon-002-sft` dataset.
`intern_moss_example.py` demonstrates how to use LoRA for fine-tuning on the `fnlp/moss-moon-002-sft` dataset.
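As a rough illustration of what such a LoRA setup typically involves, the sketch below wraps the converted model with adapters via the `peft` library; the use of `peft`, the hyperparameters, and the `target_modules` names are assumptions and may not match `intern_moss_example.py`.
```python
# Hypothetical LoRA sketch; the peft library, hyperparameters, and module
# names are assumptions and may differ from intern_moss_example.py.
from peft import LoraConfig, get_peft_model

from modeling_internlm import InternLMForCausalLM

base_model = InternLMForCausalLM.from_pretrained("hf_ckpt/")

lora_config = LoraConfig(
    r=8,                                  # low-rank adapter dimension
    lora_alpha=16,                        # adapter scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights stay trainable

# The wrapped model can then be fine-tuned on the fnlp/moss-moon-002-sft data
# with a standard transformers Trainer loop.
```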