InternLM/tools/transformers
README-zh-Hans.md
README.md
configuration_internlm.py
convert2hf.py
intern_moss_example.py
internlm_sft_on_moss.py
modeling_internlm.py
tokenization_internlm.py

README.md

InternLM Transformers

English | 简体中文

This folder contains the InternLM model in the Hugging Face Transformers format.

Weight Conversion

convert2hf.py can convert weights saved during training into the Transformers format with a single command. Run the command from the root directory of the repository:

python tools/transformers/convert2hf.py --src_folder origin_ckpt/ --tgt_folder hf_ckpt/ --tokenizer ./tools/V7_sft.model

Then, you can load it using the from_pretrained interface:

>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("hf_ckpt/", trust_remote_code=True)
>>> model = AutoModel.from_pretrained("hf_ckpt/", trust_remote_code=True).cuda()
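
As a quick sanity check, you can run generation on the loaded checkpoint. This is a minimal sketch assuming the checkpoint's auto_map routes AutoModel to the causal-LM class; the prompt and generation parameters are illustrative, not part of the original example:

>>> inputs = tokenizer("Hello, InternLM!", return_tensors="pt").to("cuda")  # illustrative prompt
>>> output = model.generate(**inputs, max_new_tokens=64)
>>> print(tokenizer.decode(output[0], skip_special_tokens=True))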

intern_moss_example.py demonstrates how to fine-tune the model on the fnlp/moss-moon-002-sft dataset using LoRA; a sketch of such a setup follows.
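
The sketch below shows the general shape of a LoRA setup with the peft library; the checkpoint path, target module names, and hyperparameters are illustrative assumptions, not the exact settings used in intern_moss_example.py.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# Load the base model; the checkpoint path is illustrative.
model = AutoModelForCausalLM.from_pretrained("hf_ckpt/", trust_remote_code=True)

# Wrap the model with LoRA adapters; r, lora_alpha, lora_dropout, and
# target_modules are assumed values, not taken from intern_moss_example.py.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor for the LoRA updates
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA weights remain trainable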