Insu Jang
00525f7772
[shardformer] fix pipeline forward error if custom layer distribution is used ( #5189 )
...
* Use self.[distribute_layers|get_stage_index] to exploit custom layer distribution
* Change static methods for t5 layer distribution to member functions
* Change static methods for whisper layer distribution to member functions
* Replace whisper policy usage with self one
* Fix test case to use non-static layer distribution methods
* fix: fix typo
---------
Co-authored-by: Wenhao Chen <cwher@outlook.com>
2024-03-27 13:57:00 +08:00
flybird11111
79718fae04
[shardformer] llama support DistCrossEntropy ( #5176 )
...
* fix
aaa
fix
fix
fix
* fix
* fix
* test ci
* fix ci
fix
* llama support dist-cross
fix
fix
fix
fix
fix
fix
fix
fix
* fix
* fix
* fix
fix
* test ci
* test ci
* fix
* [Colossal-Llama-2] Add finetuning Colossal-Llama-2 example (#4878 )
* Add finetuning Colossal-Llama-2 example
* Add finetuning Colossal-Llama-2 example 2
* Add finetuning Colossal-Llama-2 example and support NEFTuning
* Add inference example and refine neftune
* Modify readme file
* update the imports
---------
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
Co-authored-by: Camille Zhong <44392324+Camille7777@users.noreply.github.com>
* llama support dist-cross
fix
fix
fix
fix
fix
fix
fix
fix
* fix
* fix
* fix
fix
* test ci
* test ci
* fix
* fix ci
* fix ci
---------
Co-authored-by: Yuanchen <70520919+chengeharrison@users.noreply.github.com>
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
Co-authored-by: Camille Zhong <44392324+Camille7777@users.noreply.github.com>
2023-12-13 01:39:14 +08:00
Hongxin Liu
079bf3cb26
[misc] update pre-commit and run all files ( #4752 )
...
* [misc] update pre-commit
* [misc] run pre-commit
* [misc] remove useless configuration files
* [misc] ignore cuda for clang-format
2023-09-19 14:20:26 +08:00
Frank Lee
f22ddacef0
[shardformer] refactored the shardformer layer structure ( #4053 )
2023-07-04 16:05:01 +08:00