ColossalAI/colossalai/inference/core
Yuanheng Zhao 7b249c76e5
[Fix] Fix spec-dec Glide LlamaModel for compatibility with transformers (#5837)
* fix glide llama model

* revise
2024-06-19 15:37:53 +08:00
File                Last commit                                                                      Last updated
__init__.py         [doc] updated inference readme (#5343)                                           2024-02-02 14:31:10 +08:00
async_engine.py     [Inference] Fix API server, test and example (#5712)                             2024-05-15 15:47:31 +08:00
engine.py           [Fix] Fix spec-dec Glide LlamaModel for compatibility with transformers (#5837)  2024-06-19 15:37:53 +08:00
plugin.py           [Feat]Tensor Model Parallel Support For Inference (#5563)                        2024-04-18 16:56:46 +08:00
request_handler.py  [Inference]Add Streaming LLM (#5745)                                             2024-06-05 10:51:19 +08:00
rpc_engine.py       [Inference] Fix Inference Generation Config and Sampling (#5710)                 2024-05-19 15:08:42 +08:00