test_async_engine
[Inference] Fix bugs and docs for feat/online-server (#5598)
2024-05-08 15:20:53 +00:00
test_kernels
[release] update version (#6041)
2024-09-10 10:31:09 +08:00
test_models
[Inference] Fix flash-attn import and add model test (#5794)
2024-06-12 14:13:50 +08:00
__init__.py
[Fix] Fix Inference Example, Tests, and Requirements (#5688)
2024-05-08 11:30:15 +08:00
_utils.py
[Inference] Add the logic of the inference engine (#5173)
2024-01-11 13:39:56 +00:00
test_batch_bucket.py
[Fix/Inference] Fix format of input prompts and input model in inference engine (#5395)
2024-02-23 10:51:35 +08:00
test_config_and_struct.py
[Fix] Fix Inference Example, Tests, and Requirements (#5688)
2024-05-08 11:30:15 +08:00
test_continuous_batching.py
[inference] Fix running time of test_continuous_batching (#5750)
2024-05-24 19:34:15 +08:00
test_cuda_graph.py
[Fix] Fix Inference Example, Tests, and Requirements (#5688)
2024-05-08 11:30:15 +08:00
test_drafter.py
[Fix] Fix Inference Example, Tests, and Requirements (#5688)
2024-05-08 11:30:15 +08:00
test_inference_engine.py
[Inference] Fix bugs and docs for feat/online-server (#5598)
2024-05-08 15:20:53 +00:00
test_kvcache_manager.py
[Fix] Fix & Update Inference Tests (compatibility w/ main)
2024-05-05 16:28:56 +00:00
test_request_handler.py
[Fix] Fix & Update Inference Tests (compatibility w/ main)
2024-05-05 16:28:56 +00:00
test_rpc_engine.py
[release] update version (#5752)
2024-05-31 19:40:26 +08:00
test_streamingllm.py
[Inference]Add Streaming LLM (#5745)
2024-06-05 10:51:19 +08:00