Making large AI models cheaper, faster and more accessible
Wenxuan Tan 8fd25d6e09
[Feature] Split cross-entropy computation in SP (#5959)
2 months ago
__init__.py [moe] deepseek moe sp support 4 months ago
albert.py [misc] update pre-commit and run all files (#4752) 1 year ago
bert.py [test] merge old components to test to model zoo (#4945) 1 year ago
blip2.py [test] merge old components to test to model zoo (#4945) 1 year ago
bloom.py [test] merge old components to test to model zoo (#4945) 1 year ago
chatglm2.py [test] fix chatglm test kit (#5793) 5 months ago
command.py [Feature] Zigzag Ring attention (#5905) 3 months ago
deepseek.py [misc] remove debug/print code 4 months ago
falcon.py [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) 1 year ago
gpt.py [Feature] Split cross-entropy computation in SP (#5959) 2 months ago
gptj.py [workflow] fixed oom tests (#5275) 10 months ago
llama.py [Feature] Zigzag Ring attention (#5905) 3 months ago
mistral.py [Feature] Zigzag Ring attention (#5905) 3 months ago
mixtral.py [Feature] MoE Ulysses Support (#5918) 4 months ago
opt.py [test] merge old components to test to model zoo (#4945) 1 year ago
qwen2.py [Feature] Zigzag Ring attention (#5905) 3 months ago
sam.py [test] merge old components to test to model zoo (#4945) 1 year ago
t5.py [shardformer] Support the T5ForTokenClassification model (#5816) 5 months ago
vit.py [test] merge old components to test to model zoo (#4945) 1 year ago
whisper.py [test] merge old components to test to model zoo (#4945) 1 year ago