Name | Last commit message | Last commit date
test_async_engine | [Inference] Fix bugs and docs for feat/online-server (#5598) | 2024-05-08 15:20:53 +00:00
test_kernels | [release] update version (#5752) | 2024-05-31 19:40:26 +08:00
test_models | Pass inference model shard configs for module init | 2024-06-07 08:33:52 +00:00
__init__.py | [Fix] Fix Inference Example, Tests, and Requirements (#5688) | 2024-05-08 11:30:15 +08:00
_utils.py | [Inference] Add the logic of the inference engine (#5173) | 2024-01-11 13:39:56 +00:00
test_batch_bucket.py | [Fix/Inference] Fix format of input prompts and input model in inference engine (#5395) | 2024-02-23 10:51:35 +08:00
test_config_and_struct.py | [Fix] Fix Inference Example, Tests, and Requirements (#5688) | 2024-05-08 11:30:15 +08:00
test_continuous_batching.py | [inference] Fix running time of test_continuous_batching (#5750) | 2024-05-24 19:34:15 +08:00
test_cuda_graph.py | [Fix] Fix Inference Example, Tests, and Requirements (#5688) | 2024-05-08 11:30:15 +08:00
test_drafter.py | [Fix] Fix Inference Example, Tests, and Requirements (#5688) | 2024-05-08 11:30:15 +08:00
test_inference_engine.py | [Inference] Fix bugs and docs for feat/online-server (#5598) | 2024-05-08 15:20:53 +00:00
test_kvcache_manager.py | [Fix] Fix & Update Inference Tests (compatibility w/ main) | 2024-05-05 16:28:56 +00:00
test_request_handler.py | [Fix] Fix & Update Inference Tests (compatibility w/ main) | 2024-05-05 16:28:56 +00:00
test_rpc_engine.py | [release] update version (#5752) | 2024-05-31 19:40:26 +08:00
test_streamingllm.py | [Inference]Add Streaming LLM (#5745) | 2024-06-05 10:51:19 +08:00