__init__.py | [moe] deepseek moe sp support | 2024-08-01 10:06:59 +08:00
albert.py | [misc] update pre-commit and run all files (#4752) | 2023-09-19 14:20:26 +08:00
bert.py | [test] merge old components to test to model zoo (#4945) | 2023-10-20 10:35:08 +08:00
blip2.py | [test] merge old components to test to model zoo (#4945) | 2023-10-20 10:35:08 +08:00
bloom.py | [test] merge old components to test to model zoo (#4945) | 2023-10-20 10:35:08 +08:00
chatglm2.py | [test] fix chatglm test kit (#5793) | 2024-06-11 16:54:31 +08:00
command.py | fix precommit | 2024-06-18 02:31:33 +00:00
deepseek.py | [misc] remove debug/print code | 2024-08-01 10:06:59 +08:00
falcon.py | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 2023-11-28 16:54:42 +08:00
gpt.py | [Test/CI] remove test cases to reduce CI duration (#5753) | 2024-06-05 11:29:04 +08:00
gptj.py | [workflow] fixed oom tests (#5275) | 2024-01-16 18:55:13 +08:00
llama.py | [shardformer] update transformers (#5583) | 2024-04-24 22:51:50 +08:00
mistral.py | [shardformer] update transformers (#5583) | 2024-04-24 22:51:50 +08:00
mixtral.py | [Feature] MoE Ulysses Support (#5918) | 2024-08-01 10:06:59 +08:00
opt.py | [test] merge old components to test to model zoo (#4945) | 2023-10-20 10:35:08 +08:00
qwen2.py | [Shardformer] Support the Qwen2 model (#5699) | 2024-05-09 20:04:25 +08:00
sam.py | [test] merge old components to test to model zoo (#4945) | 2023-10-20 10:35:08 +08:00
t5.py | [shardformer] Support the T5ForTokenClassification model (#5816) | 2024-06-27 16:40:38 +08:00
vit.py | [test] merge old components to test to model zoo (#4945) | 2023-10-20 10:35:08 +08:00
whisper.py | [test] merge old components to test to model zoo (#4945) | 2023-10-20 10:35:08 +08:00