Commit Graph

295 Commits (544b7a38a167cb05cdc7590cfc100e23c0ed5ab7)

Author SHA1 Message Date
Insu Jang 00525f7772
[shardformer] fix pipeline forward error if custom layer distribution is used (#5189)
* Use self.[distribute_layers|get_stage_index] to exploit custom layer distribution

* Change static methods for t5 layer distribution to member functions

* Change static methods for whisper layer distribution to member functions

* Replace whisper policy usage with self one

* Fix test case to use non-static layer distribution methods

* fix: fix typo

---------

Co-authored-by: Wenhao Chen <cwher@outlook.com>
2024-03-27 13:57:00 +08:00
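
For context on the fix above: once `distribute_layers` and `get_stage_index` are instance methods, a subclassed policy can override them and the pipeline forward will actually use the override. A minimal self-contained sketch of that pattern (the base-class behavior below is an assumption modeled on the commit message, not ColossalAI's exact implementation):

```python
from typing import List, Tuple

class BasePolicy:
    """Stand-in for a shardformer policy; real class name and signatures are assumptions."""

    def distribute_layers(self, num_layers: int, num_stages: int) -> List[int]:
        # Default: split layers as evenly as possible across pipeline stages.
        quotient, remainder = divmod(num_layers, num_stages)
        return [quotient + (1 if stage < remainder else 0) for stage in range(num_stages)]

    def get_stage_index(self, layers_per_stage: List[int], stage: int) -> Tuple[int, int]:
        # Map per-stage layer counts to this stage's [start, end) layer indices.
        start = sum(layers_per_stage[:stage])
        return start, start + layers_per_stage[stage]

class FrontLightPolicy(BasePolicy):
    """Custom distribution: give stage 0 one fewer layer (e.g. because the
    embedding also lives there). This override only takes effect because the
    pipeline forward now calls self.distribute_layers, not a static method."""

    def distribute_layers(self, num_layers: int, num_stages: int) -> List[int]:
        counts = super().distribute_layers(num_layers, num_stages)
        if num_stages > 1 and counts[0] > 1:
            counts[0] -= 1
            counts[-1] += 1
        return counts

policy = FrontLightPolicy()
counts = policy.distribute_layers(num_layers=32, num_stages=4)
print(counts)                                   # [7, 8, 8, 9]
print(policy.get_stage_index(counts, stage=0))  # (0, 7)
```
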
Wenhao Chen bb0a668fee
[hotfix] set return_outputs=False in examples and polish code (#5404)
* fix: simplify merge_batch

* fix: use return_outputs=False to eliminate extra memory consumption

* feat: add return_outputs warning

* style: remove `return_outputs=False` as it is the default value
2024-03-25 12:31:09 +08:00
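
The memory point in #5404: with pipeline parallelism, `return_outputs=True` keeps every micro-batch's model outputs around just so the caller can inspect them, which is wasted memory when only the loss is needed. A hedged sketch of the training-step pattern the examples were changed to (the `execute_pipeline` argument list here is inferred from the commit message, not quoted from the API):

```python
from typing import Any, Callable, Iterator

def train_step(booster: Any, model: Any, optimizer: Any,
               data_iter: Iterator, criterion: Callable) -> float:
    """One pipeline training step; names and signature are illustrative."""
    out = booster.execute_pipeline(
        data_iter, model, criterion, optimizer,
        return_loss=True,
        return_outputs=False,  # the default after this fix: don't retain
                               # per-micro-batch outputs, only the loss
    )
    optimizer.step()
    optimizer.zero_grad()
    loss = out["loss"]
    return loss.item() if loss is not None else float("nan")
```
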
binmakeswell d158fc0e64
[doc] update open-sora demo (#5479)
* [doc] update open-sora demo
2024-03-20 16:08:41 +08:00
digger yu 385e85afd4
[hotfix] fix typo s/keywrods/keywords etc. (#5429) 2024-03-12 11:25:16 +08:00
Camille Zhong da885ed540
fix tensor data update for gemini loss calculation (#5442) 2024-03-11 13:49:58 +08:00
Camille Zhong 743e7fad2f
[colossal-llama2] add stream chat example for chat version model (#5428)
* add stream chat for chat version

* remove os.system clear

* modify function name
2024-03-07 14:58:56 +08:00
hugo-syn c8003d463b
[doc] Fix typo s/infered/inferred/ (#5288)
Signed-off-by: hugo-syn <hugo.vincent@synacktiv.com>
2024-03-05 22:02:08 +08:00
Dongruixuan Li a7ae2b5b4c
[eval-hotfix] set few_shot_data to None when few shot is disabled (#5422) 2024-03-05 21:48:55 +08:00
binmakeswell 822241a99c
[doc] sora release (#5425)
* [doc] sora release
2024-03-05 12:08:58 +08:00
Camille Zhong 4b8312c08e
fix sft single turn inference example (#5416) 2024-03-01 17:27:50 +08:00
Tong Li a28c971516
update requirements (#5407) 2024-02-28 17:46:27 +08:00
CZYCW b833153fd5
[hotfix] fix variable type for top_p (#5313)
Co-authored-by: binmakeswell <binmakeswell@gmail.com>
2024-02-19 18:25:44 +08:00
Hongxin Liu 7303801854
[llama] fix training and inference scripts (#5384)
* [llama] refactor inference example to fit sft

* [llama] fix training script to fit gemini

* [llama] fix inference script
2024-02-19 16:41:04 +08:00
Frank Lee efef43b53c
Merge pull request #5372 from hpcaitech/exp/mixtral 2024-02-08 16:30:05 +08:00
Hongxin Liu 65e5d6baa5 [moe] fix mixtral optim checkpoint (#5344) 2024-02-07 19:21:02 +08:00
Hongxin Liu 956b561b54 [moe] fix mixtral forward default value (#5329) 2024-02-07 19:21:02 +08:00
Hongxin Liu b60be18dcc [moe] fix mixtral checkpoint io (#5314) 2024-02-07 19:21:02 +08:00
Hongxin Liu da39d21b71 [moe] support mixtral (#5309)
* [moe] add mixtral block for single expert

* [moe] mixtral block fwd support uneven ep

* [moe] mixtral block bwd support uneven ep

* [moe] add mixtral moe layer

* [moe] simplify replace

* [moe] support save sharded mixtral

* [moe] support load sharded mixtral

* [moe] support save sharded optim

* [moe] integrate moe manager into plugin

* [moe] fix optimizer load

* [moe] fix mixtral layer
2024-02-07 19:21:02 +08:00
Hongxin Liu c904d2ae99 [moe] update capacity computing (#5253)
* [moe] top2 allow uneven input

* [moe] update capacity computing

* [moe] remove debug info
2024-02-07 19:21:02 +08:00
Xuanlei Zhao 7d8e0338a4 [moe] init mixtral impl 2024-02-07 19:21:02 +08:00
Hongxin Liu 084c91246c
[llama] fix memory issue (#5371)
* [llama] fix memory issue

* [llama] add comment
2024-02-06 19:02:37 +08:00
Hongxin Liu eb4f2d90f9
[llama] polish training script and fix optim ckpt (#5368) 2024-02-06 11:52:17 +08:00
Camille Zhong a5756a8720
[eval] update llama npu eval (#5366) 2024-02-06 10:53:03 +08:00
Camille Zhong 44ca61a22b
[llama] fix neftune & pbar with start_step (#5364) 2024-02-05 18:04:23 +08:00
Hongxin Liu a4cec1715b
[llama] add flash attn patch for npu (#5362) 2024-02-05 16:48:34 +08:00
Hongxin Liu 73f9f23fc6
[llama] update training script (#5360)
* [llama] update training script

* [doc] polish docstr
2024-02-05 16:33:18 +08:00
Hongxin Liu 6c0fa7b9a8
[llama] fix dataloader for hybrid parallel (#5358)
* [plugin] refactor prepare dataloader

* [plugin] update train script
2024-02-05 15:14:56 +08:00
YeAnbang c5239840e6
[Chat] fix sft loss nan (#5345)
* fix script

* fix chat nan
2024-02-01 14:25:16 +08:00
Frank Lee 8823cc4831
Merge pull request #5310 from hpcaitech/feature/npu
Feature/npu
2024-01-29 13:49:39 +08:00
李文军 ec912b1ba9
[NFC] polish applications/Colossal-LLaMA-2/colossal_llama2/tokenizer/init_tokenizer.py code style (#5228) 2024-01-25 13:14:48 +08:00
Desperado-Jia ddf879e2db
fix bug for mixture (#5299) 2024-01-22 22:17:54 +08:00
Michelle 32cb74493a
fix auto loading gpt2 tokenizer (#5279) 2024-01-18 14:08:29 +08:00
ver217 148469348a Merge branch 'main' into sync/npu 2024-01-18 12:05:21 +08:00
digger yu 756c400ad2
fix typo in applications/ColossalEval/README.md (#5250) 2024-01-11 17:58:38 +08:00
digger yu 41e52c1c6e
[doc] fix typo in Colossal-LLaMA-2/README.md (#5247) 2024-01-10 19:24:56 +08:00
Hongxin Liu d202cc28c0
[npu] change device to accelerator api (#5239)
* update accelerator

* fix timer

* fix amp

* update

* fix

* update bug

* add error raise

* fix autocast

* fix set device

* remove doc accelerator

* update doc

* use nullcontext

* update cpu

* update null context

* change time limit for example

* update

* [npu] polish accelerator code

---------

Co-authored-by: Xuanlei Zhao <xuanlei.zhao@gmail.com>
Co-authored-by: zxl <43881818+oahzxl@users.noreply.github.com>
2024-01-09 10:20:05 +08:00
binmakeswell 7bc6969ce6
[doc] SwiftInfer release (#5236)
* [doc] SwiftInfer release
2024-01-08 09:55:12 +08:00
github-actions[bot] 4fb4a22a72
[format] applied code formatting on changed files in pull request 5234 (#5235)
Co-authored-by: github-actions <github-actions@github.com>
2024-01-07 20:55:34 +08:00
binmakeswell b9b32b15e6
[doc] add Colossal-LLaMA-2-13B (#5234)
* [doc] add Colossal-LLaMA-2-13B
2024-01-07 20:53:12 +08:00
Camille Zhong 915b4652f3
[doc] Update README.md of Colossal-LLAMA2 (#5233)
* Update README.md
2024-01-06 17:06:41 +08:00
Tong Li d992b55968
[Colossal-LLaMA-2] Release Colossal-LLaMA-2-13b-base model (#5224)
* update readme

* update link

* update

* update title

* update example

* fix content

* add conclusion

* add license

* update version

* fix minor
2024-01-05 17:24:26 +08:00
Yuanchen eae01b6740
Improve logic for selecting metrics (#5196)
Co-authored-by: Xu <yuanchen.xu00@gmail.com>
2023-12-22 14:52:50 +08:00
BlueRum af952673f7
polish readme in application/chat (#5194) 2023-12-20 11:28:39 +08:00
Yuanchen 3ff60d13b0
Fix ColossalEval (#5186)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-12-15 15:06:06 +08:00
Yuanchen cefdc32615
[ColossalEval] Support GSM, Data Leakage Evaluation and Tensor Parallel (#5169)
* Support GSM, Data Leakage Evaluation and Tensor Parallel

* remove redundant code and update inference.py in examples/gpt_evaluation

---------

Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-12-12 14:47:35 +08:00
Michelle b07a6f4e27
[colossalqa] fix pangu api (#5170)
* fix pangu api

* add comment
2023-12-11 14:08:11 +08:00
Yuanchen b397104438
[Colossal-Llama-2] Add finetuning Colossal-Llama-2 example (#4878)
* Add finetuning Colossal-Llama-2 example

* Add finetuning Colossal-Llama-2 example 2

* Add finetuning Colossal-Llama-2 example and support NEFTuning

* Add inference example and refine neftune

* Modify readme file

* update the imports

---------

Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
Co-authored-by: Camille Zhong <44392324+Camille7777@users.noreply.github.com>
2023-12-07 14:02:03 +08:00
Michelle 368b5e3d64
[doc] fix colossalqa document (#5146)
* fix doc

* modify doc
2023-12-01 21:39:53 +08:00
Michelle c7fd9a5213
[ColossalQA] refactor server and webui & add new feature (#5138)
* refactor server and webui & add new feature

* add requirements

* modify readme and ui
2023-11-30 22:55:52 +08:00
github-actions[bot] f6731db67c
[format] applied code formatting on changed files in pull request 5115 (#5118)
Co-authored-by: github-actions <github-actions@github.com>
2023-11-29 13:39:14 +08:00
digger yu 9110406a47
fix typo change JOSNL TO JSONL etc. (#5116) 2023-11-29 11:08:32 +08:00
Zian(Andy) Zheng 7b789f4dd2 [FEATURE] Add Safety Eval Datasets to ColossalEval (#5095)
* add safetybench and cvalues(responsibility) eval dataset

* Modify code according to review suggestions

---------

Co-authored-by: Orion-Zheng <zhengzian@u.nus.edu>
2023-11-28 11:15:04 +08:00
digger yu d5661f0f25
[nfc] fix typo change directoty to directory (#5111) 2023-11-27 18:25:53 +08:00
YeAnbang e53e729d8e
[Feature] Add document retrieval QA (#5020)
* add langchain

* Add files via upload

* fix style

* fix style: remove extra space

* add pytest; modified retriever

* add tests to build_on_pr.yml

* fix build_on_pr.yml

* fix build on pr; fix environ vars

* separate unit tests for colossalqa from build on pr

* fix container setting; fix environ vars

* commented dev code

* add incremental update

* remove stale code

* fix style

* change to sha3 224

* fix retriever; fix style; add unit test for document loader

* fix ci workflow config

* add set cuda visible device script in ci

* fix doc string

* fix style; update readme; refactored

* add force log info

* change build on pr, ignore colossalqa

* fix docstring, capitalize all initial letters

* fix indexing; fix text-splitter

* remove debug code, update reference

* reset previous commit

* update LICENSE, update README, add key-value mode, fix bugs

* add files back

* revert force push

* remove junk file

* add test files

* fix retriever bug, add intent classification

* change conversation chain design

* rewrite prompt and conversation chain

* add ui v1

* ui v1

* fix avatar

* add header

* Refactor the RAG Code and support Pangu

* Refactor the ColossalQA chain into an object-oriented design and refactor the UI demo.

* resolved conversation. tested scripts under examples. web demo still buggy

* fix ci tests

* Some modifications to add ChatGPT api

* modify llm.py and remove unnecessary files

* Delete applications/ColossalQA/examples/ui/test_frontend_input.json

* Remove OpenAI api key

* add colossalqa

* move files

* fix style

* Add Readme and fix some bugs.

* Add something to readme and modify some code

* modify a directory name for clarity

* remove redundant directory

* Correct a typo in llm.py

* fix AI prefix

* fix test_memory.py

* fix conversation

* fix some errors and typos

* Fix a missing import in RAG_ChatBot.py

* add colossalcloud LLM wrapper, correct issues in code review

---------

Co-authored-by: YeAnbang <anbangy2@outlook.com>
Co-authored-by: Orion-Zheng <zheng_zian@u.nus.edu>
Co-authored-by: Zian(Andy) Zheng <62330719+Orion-Zheng@users.noreply.github.com>
Co-authored-by: Orion-Zheng <zhengzian@u.nus.edu>
2023-11-23 10:33:48 +08:00
Orion-Zheng 43ad0d9ef0 fix wrong EOS token in ColossalChat 2023-11-14 10:49:49 +08:00
Yuanchen 239cd92eff
Support mtbench (#5025)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-11-09 13:41:50 +08:00
Yuanchen abe071b663
fix ColossalEval (#4992)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-10-31 10:30:03 +08:00
github-actions[bot] a41cf88e9b
[format] applied code formatting on changed files in pull request 4908 (#4918)
Co-authored-by: github-actions <github-actions@github.com>
2023-10-17 10:48:24 +08:00
Zian(Andy) Zheng 7768afbad0 Update flash_attention_patch.py
To be compatible with a recent change in the Transformers library, where a new argument 'padding_mask' was added to the forward function of the attention layer.
https://github.com/huggingface/transformers/pull/25598
2023-10-16 14:00:45 +08:00
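
For context, the breakage this commit guards against: transformers PR #25598 began passing a new `padding_mask` keyword into the attention `forward()`, so a monkey-patched flash-attention forward must at least accept that argument or every call raises a TypeError. An illustrative stub, not the actual patch (the surrounding parameter list follows the usual Llama attention signature):

```python
from typing import Optional
import torch

def patched_attention_forward(
    self,
    hidden_states: torch.Tensor,
    attention_mask: Optional[torch.Tensor] = None,
    position_ids: Optional[torch.LongTensor] = None,
    past_key_value=None,
    output_attentions: bool = False,
    use_cache: bool = False,
    padding_mask: Optional[torch.Tensor] = None,  # new kwarg from transformers PR #25598
    **kwargs,  # tolerate further upstream signature changes
):
    # The real patch computes flash attention here; this stub only shows the
    # signature fix. padding_mask is accepted (and ignored here) so newer
    # transformers versions can call the patched forward without a TypeError.
    attn_output = hidden_states
    return attn_output, None, past_key_value
```
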
Camille Zhong 652adc2215 Update README.md 2023-10-10 23:19:34 +08:00
Camille Zhong afe10a85fd Update README.md 2023-10-10 23:19:34 +08:00
Camille Zhong 3043d5d676 Update modelscope link in README.md
add modelscope link
2023-10-10 23:19:34 +08:00
Tong Li ed06731e00
update Colossal (#4832) 2023-09-28 16:05:05 +08:00
binmakeswell 822051d888
[doc] update slack link (#4823) 2023-09-27 17:37:39 +08:00
Yuanchen 1fa8c5e09f
Update Qwen-7B results (#4821)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-09-27 17:33:54 +08:00
flybird11111 be400a0936
[chat] fix gemini strategy (#4698)
* [chat] fix gemini strategy

* [chat] fix gemini strategy (squash of 2 commits)

* [chat] fix gemini strategy: update llama2 example

* [fix] fix gemini strategy

* fix

* Update train_prompts.py
2023-09-27 13:15:32 +08:00
Chandler-Bing b6cf0aca55
[hotfix] change llama2 Colossal-LLaMA-2 script filename (#4800)
change filename:
pretraining.py -> trainin.py
there is no file named pretraining.py; the original name was wrong
2023-09-26 11:44:27 +08:00
Tong Li 8cbce6184d update 2023-09-26 11:36:53 +08:00
Tong Li bd014673b0 update readme 2023-09-26 10:58:05 +08:00
binmakeswell d512a4d38d
[doc] add llama2 domain-specific solution news (#4789)
* [doc] add llama2 domain-specific solution news
2023-09-25 10:44:15 +08:00
Yuanchen ce777853ae
[feature] ColossalEval: Evaluation Pipeline for LLMs (#4786)
* Add ColossalEval

* Delete evaluate in Chat

---------

Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
Co-authored-by: Tong Li <tong.li352711588@gmail.com>
2023-09-24 23:14:11 +08:00
Tong Li 74aa7d964a
initial commit: add colossal llama 2 (#4784) 2023-09-24 23:12:26 +08:00
Wenhao Chen 901ab1eedd
[chat]: add lora merge weights config (#4766)
* feat: modify lora merge weights fn

* feat: add lora merge weights config
2023-09-21 16:23:59 +08:00
Wenhao Chen 7b9b86441f
[chat]: update rm, add wandb and fix bugs (#4471)
* feat: modify forward fn of critic and reward model

* feat: modify calc_action_log_probs

* to: add wandb in sft and rm trainer

* feat: update train_sft

* feat: update train_rm

* style: modify type annotation and add warning

* feat: pass tokenizer to ppo trainer

* to: modify trainer base and maker base

* feat: add wandb in ppo trainer

* feat: pass tokenizer to generate

* test: update generate fn tests

* test: update train tests

* fix: remove action_mask

* feat: remove unused code

* fix: fix wrong ignore_index

* fix: fix mock tokenizer

* chore: update requirements

* revert: modify make_experience

* fix: fix inference

* fix: add padding side

* style: modify _on_learn_batch_end

* test: use mock tokenizer

* fix: use bf16 to avoid overflow

* fix: fix workflow

* [chat] fix gemini strategy

* [chat] fix

* sync: update colossalai strategy

* fix: fix args and model dtype

* fix: fix checkpoint test

* fix: fix requirements

* fix: fix missing import and wrong arg

* fix: temporarily skip gemini test in stage 3

* style: apply pre-commit

* fix: temporarily skip gemini test in stage 1&2

---------

Co-authored-by: Mingyan Jiang <1829166702@qq.com>
2023-09-20 15:53:58 +08:00
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
* [misc] update pre-commit

* [misc] run pre-commit

* [misc] remove useless configuration files

* [misc] ignore cuda for clang-format
2023-09-19 14:20:26 +08:00
digger yu e4fc57c3de
Fixed some syntax errors in the documentation and code under applications/ (#4127)
Co-authored-by: flybird11111 <1829166702@qq.com>
2023-09-15 14:18:22 +08:00
Hongxin Liu a39a5c66fe
Merge branch 'main' into feature/shardformer 2023-09-04 23:43:13 +08:00
Ying Liu c648dc093f fix colossalai version in coati examples 2023-08-30 11:14:19 +08:00
yingliu-hpc 1467e3b41b
[coati] add chatglm model (#4539)
* update configuration of chatglm and add support in coati

* add unit test & update chatglm default config & fix bos index issue

* remove chatglm due to oom

* add dataset pkg in requirements-test

* fix parameter issue in test_models

* add ref in tokenize & rm unnecessary parts

* separate source & target tokenization in chatglm

* add unit test to chatglm

* fix test dataset issue

* update truncation of chatglm

* fix Colossalai version

* fix colossal ai version in test
2023-08-29 17:58:51 +08:00
Michelle 285fe7ba71
[chat] update config and prompt (#4139)
* update config and prompt

* update config

---------

Co-authored-by: Qianran Ma <qianranm@luchentech.com>
2023-08-21 14:30:25 +08:00
Hongxin Liu 26e29d58f0
[devops] add large-scale distributed test marker (#4452)
* [test] remove cpu marker

* [test] remove gpu marker

* [test] update pytest markers

* [ci] update unit test ci
2023-08-16 18:56:52 +08:00
Wenhao Chen 6d41c3f2aa
[doc] update Coati README (#4405)
* style: apply formatter

* fix: add outdated warnings

* docs: add dataset format and polish

* docs: polish README

* fix: fix json format

* fix: fix typos

* revert: revert 7b example
2023-08-14 15:26:27 +08:00
Wenhao Chen da4f7b855f
[chat] fix bugs and add unit tests (#4213)
* style: rename replay buffer

Experience replay is typically used for off-policy algorithms.
Using this name in PPO may be misleading.

* fix: fix wrong zero2 default arg

* test: update experience tests

* style: rename zero_pad fn

* fix: defer init in CycledDataLoader

* test: add benchmark test

* style: rename internal fn of generation

* style: rename internal fn of lora

* fix: remove unused loss fn

* fix: remove unused utils fn

* refactor: remove generate_with_actor fn

* fix: fix type annotation

* test: add models tests

* fix: skip llama due to long execution time

* style: modify dataset

* style: apply formatter

* perf: update reward dataset

* fix: fix wrong IGNORE_INDEX in sft dataset

* fix: remove DataCollatorForSupervisedDataset

* test: add dataset tests

* style: apply formatter

* style: rename test_ci to test_train

* feat: add llama in inference

* test: add inference tests

* test: change test scripts directory

* fix: update ci

* fix: fix typo

* fix: skip llama due to oom

* fix: fix file mod

* style: apply formatter

* refactor: remove duplicated llama_gptq

* style: apply formatter

* to: update rm test

* feat: add tokenizer arg

* feat: add download model script

* test: update train tests

* fix: modify gemini load and save pretrained

* test: update checkpoint io test

* to: modify nproc_per_node

* fix: do not remove existing dir

* fix: modify save path

* test: add random choice

* fix: fix sft path

* fix: enlarge nproc_per_node to avoid oom

* fix: add num_retry

* fix: make lora config of rm and critic consistent

* fix: add warning about lora weights

* fix: skip some gpt2 tests

* fix: remove grad ckpt in rm and critic due to errors

* refactor: directly use Actor in train_sft

* test: add more arguments

* fix: disable grad ckpt when using lora

* fix: fix save_pretrained and related tests

* test: enable zero2 tests

* revert: remove useless fn

* style: polish code

* test: modify test args
2023-08-02 10:17:36 +08:00
Wenhao Chen 75c5389037
[chat] fix compute_approx_kl (#4338) 2023-08-01 10:21:45 +08:00
Yuanchen 5187c96b7c
support session-based training (#4313)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-07-28 11:29:55 +08:00
yuxuan-lou 0991405361 [NFC] polish applications/Chat/coati/models/utils.py codestyle (#4277)
* [NFC] polish colossalai/context/random/__init__.py code style

* [NFC] polish applications/Chat/coati/models/utils.py code style
2023-07-26 14:12:57 +08:00
Zirui Zhu 9e512938f6 [NFC] polish applications/Chat/coati/trainer/strategies/base.py code style (#4278) 2023-07-26 14:12:57 +08:00
Ziheng Qin c972d65311 applications/Chat/.gitignore (#4279)
Co-authored-by: henryqin1997 <henryqin1997@gamil.com>
2023-07-26 14:12:57 +08:00
RichardoLuo 709e121cd5 [NFC] polish applications/Chat/coati/models/generation.py code style (#4275) 2023-07-26 14:12:57 +08:00
Yuanchen dc1b6127f9 [NFC] polish applications/Chat/inference/server.py code style (#4274)
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
2023-07-26 14:12:57 +08:00
アマデウス caa4433072 [NFC] fix format of application/Chat/coati/trainer/utils.py (#4273) 2023-07-26 14:12:57 +08:00
Xu Kai 1ce997daaf [NFC] polish applications/Chat/examples/train_reward_model.py code style (#4271) 2023-07-26 14:12:57 +08:00
shenggan 798cb72907 [NFC] polish applications/Chat/coati/trainer/base.py code style (#4260) 2023-07-26 14:12:57 +08:00
Zheng Zangwei (Alex Zheng) b2debdc09b [NFC] polish applications/Chat/coati/dataset/sft_dataset.py code style (#4259) 2023-07-26 14:12:57 +08:00
CZYCW dee1c96344 [NFC] polish applications/Chat/examples/ray/mmmt_prompt.py code style (#4250) 2023-07-26 14:12:57 +08:00
Junming Wu 77c469e1ba [NFC] polish applications/Chat/coati/models/base/actor.py code style (#4248) 2023-07-26 14:12:57 +08:00
Camille Zhong 915ed8bed1 [NFC] polish applications/Chat/inference/requirements.txt code style (#4265) 2023-07-26 14:12:57 +08:00
Frank Lee f447ca1811 [chat] removed cache file (#4155) 2023-07-04 16:05:01 +08:00
wukong1992 c1c672d0f0 [shardformer] shardformer support t5 model (#3994)
test t5
2023-07-04 16:05:01 +08:00
Wenhao Chen 3d8d5d0d58
[chat] use official transformers and fix some issues (#4117)
* feat: remove on_learn_epoch fn as not used

* revert: add _on_learn_epoch fn

* feat: remove NaiveStrategy

* test: update train_prompts tests

* fix: remove prepare_llama_tokenizer_and_embedding

* test: add lora arg

* feat: remove roberta support in train_prompts due to runtime errs

* feat: remove deberta & roberta in rm as not used

* test: remove deberta and roberta tests

* feat: remove deberta and roberta models as not used

* fix: remove calls to roberta

* fix: remove prepare_llama_tokenizer_and_embedding

* chore: update transformers version

* docs: update transformers version

* fix: fix actor inference

* fix: fix ci

* feat: change llama pad token to unk

* revert: revert ddp setup_distributed

* fix: change llama pad token to unk

* revert: undo unnecessary changes

* fix: use pip to install transformers
2023-07-04 13:49:09 +08:00