Hongxin Liu
641b1ee71a
[devops] remove post commit ci ( #5566 )
...
* [devops] remove post commit ci
* [misc] run pre-commit on all files
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
8 months ago
Camille Zhong
da885ed540
fix tensor data update for gemini loss calculation ( #5442 )
9 months ago
Camille Zhong
743e7fad2f
[colossal-llama2] add stream chat example for chat version model ( #5428 )
...
* add stream chat for chat version
* remove os.system clear
* modify function name
9 months ago
Camille Zhong
4b8312c08e
fix sft single turn inference example ( #5416 )
9 months ago
Tong Li
a28c971516
update requirements ( #5407 )
9 months ago
CZYCW
b833153fd5
[hotfix] fix variable type for top_p ( #5313 )
...
Co-authored-by: binmakeswell <binmakeswell@gmail.com>
9 months ago
Hongxin Liu
7303801854
[llama] fix training and inference scripts ( #5384 )
...
* [llama] refactor inference example to fit sft
* [llama] fix training script to fit gemini
* [llama] fix inference script
9 months ago
Hongxin Liu
084c91246c
[llama] fix memory issue ( #5371 )
...
* [llama] fix memory issue
* [llama] add comment
10 months ago
Hongxin Liu
eb4f2d90f9
[llama] polish training script and fix optim ckpt ( #5368 )
10 months ago
Camille Zhong
44ca61a22b
[llama] fix neftune & pbar with start_step ( #5364 )
10 months ago
Hongxin Liu
a4cec1715b
[llama] add flash attn patch for npu ( #5362 )
10 months ago
Hongxin Liu
73f9f23fc6
[llama] update training script ( #5360 )
...
* [llama] update training script
* [doc] polish docstr
10 months ago
Hongxin Liu
6c0fa7b9a8
[llama] fix dataloader for hybrid parallel ( #5358 )
...
* [plugin] refactor prepare dataloader
* [plugin] update train script
10 months ago
Frank Lee
8823cc4831
Merge pull request #5310 from hpcaitech/feature/npu
...
Feature/npu
10 months ago
李文军
ec912b1ba9
[NFC] polish applications/Colossal-LLaMA-2/colossal_llama2/tokenizer/init_tokenizer.py code style ( #5228 )
10 months ago
Desperado-Jia
ddf879e2db
fix bug for mefture ( #5299 )
10 months ago
ver217
148469348a
Merge branch 'main' into sync/npu
11 months ago
digger yu
41e52c1c6e
[doc] fix typo in Colossal-LLaMA-2/README.md ( #5247 )
11 months ago
Hongxin Liu
d202cc28c0
[npu] change device to accelerator api ( #5239 )
...
* update accelerator
* fix timer
* fix amp
* update
* fix
* update bug
* add error raise
* fix autocast
* fix set device
* remove doc accelerator
* update doc
* update doc
* update doc
* use nullcontext
* update cpu
* update null context
* change time limit for example
* update
* update
* update
* update
* [npu] polish accelerator code
---------
Co-authored-by: Xuanlei Zhao <xuanlei.zhao@gmail.com>
Co-authored-by: zxl <43881818+oahzxl@users.noreply.github.com>
11 months ago
github-actions[bot]
4fb4a22a72
[format] applied code formatting on changed files in pull request 5234 ( #5235 )
...
Co-authored-by: github-actions <github-actions@github.com>
11 months ago
binmakeswell
b9b32b15e6
[doc] add Colossal-LLaMA-2-13B ( #5234 )
...
* [doc] add Colossal-LLaMA-2-13B
* [doc] add Colossal-LLaMA-2-13B
* [doc] add Colossal-LLaMA-2-13B
11 months ago
Camille Zhong
915b4652f3
[doc] Update README.md of Colossal-LLAMA2 ( #5233 )
...
* Update README.md
* Update README.md
11 months ago
Tong Li
d992b55968
[Colossal-LLaMA-2] Release Colossal-LLaMA-2-13b-base model ( #5224 )
...
* update readme
* update readme
* update link
* update
* update readme
* update
* update
* update
* update title
* update example
* update example
* fix content
* add conclusion
* add license
* update
* update
* update version
* fix minor
11 months ago
Yuanchen
b397104438
[Colossal-Llama-2] Add finetuning Colossal-Llama-2 example ( #4878 )
...
* Add finetuning Colossal-Llama-2 example
* Add finetuning Colossal-Llama-2 example 2
* Add finetuning Colossal-Llama-2 example and support NEFTuning
* Add inference example and refine neftune
* Modify readme file
* update the imports
---------
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
Co-authored-by: Camille Zhong <44392324+Camille7777@users.noreply.github.com>
12 months ago
digger yu
9110406a47
fix typo change JOSNL TO JSONL etc. ( #5116 )
1 year ago
digger yu
d5661f0f25
[nfc] fix typo change directoty to directory ( #5111 )
1 year ago
github-actions[bot]
a41cf88e9b
[format] applied code formatting on changed files in pull request 4908 ( #4918 )
...
Co-authored-by: github-actions <github-actions@github.com>
1 year ago
Zian(Andy) Zheng
7768afbad0
Update flash_attention_patch.py
...
To be compatible with the new change in the Transformers library, where a new argument 'padding_mask' was added to forward function of attention layer.
https://github.com/huggingface/transformers/pull/25598
1 year ago
Camille Zhong
652adc2215
Update README.md
1 year ago
Camille Zhong
afe10a85fd
Update README.md
1 year ago
Camille Zhong
3043d5d676
Update modelscope link in README.md
...
add modelscope link
1 year ago
Yuanchen
1fa8c5e09f
Update Qwen-7B results ( #4821 )
...
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
1 year ago
Chandler-Bing
b6cf0aca55
[hotfix] change llama2 Colossal-LLaMA-2 script filename ( #4800 )
...
change filename:
pretraining.py -> trainin.py
there is no file named pretraing.py. wrong writing
1 year ago
Tong Li
8cbce6184d
update
1 year ago
Tong Li
bd014673b0
update readme
1 year ago
binmakeswell
d512a4d38d
[doc] add llama2 domain-specific solution news ( #4789 )
...
* [doc] add llama2 domain-specific solution news
1 year ago
Tong Li
74aa7d964a
initial commit: add colossal llama 2 ( #4784 )
1 year ago