Making large AI models cheaper, faster and more accessible
Xu Kai fd6482ad8c [inference] Refactor inference architecture (#5057) 1 year ago
__init__.py
auto_policy.py [inference] Refactor inference architecture (#5057) 1 year ago
base_policy.py
bert.py
blip2.py
bloom.py
chatglm2.py [Inference] Fix bug in ChatGLM2 Tensor Parallelism (#5014) 1 year ago
gpt2.py
llama.py
opt.py
sam.py
t5.py [gemini] gemini support tensor parallelism. (#4942) 1 year ago
vit.py
whisper.py