test_async_engine | [Inference] Fix bugs and docs for feat/online-server (#5598) | 2024-05-08 15:20:53 +00:00
test_kernels | add paged-attetionv2: support seq length split across thread block (#5707) | 2024-05-14 12:46:54 +08:00
test_models | [Fix] Fix Inference Example, Tests, and Requirements (#5688) | 2024-05-08 11:30:15 +08:00
__init__.py | [Fix] Fix Inference Example, Tests, and Requirements (#5688) | 2024-05-08 11:30:15 +08:00
_utils.py | [Inference] Add the logic of the inference engine (#5173) | 2024-01-11 13:39:56 +00:00
test_batch_bucket.py | [Fix/Inference] Fix format of input prompts and input model in inference engine (#5395) | 2024-02-23 10:51:35 +08:00
test_config_and_struct.py | [Fix] Fix Inference Example, Tests, and Requirements (#5688) | 2024-05-08 11:30:15 +08:00
test_continuous_batching.py | [inference] Fix running time of test_continuous_batching (#5750) | 2024-05-24 19:34:15 +08:00
test_cuda_graph.py | [Fix] Fix Inference Example, Tests, and Requirements (#5688) | 2024-05-08 11:30:15 +08:00
test_drafter.py | [Fix] Fix Inference Example, Tests, and Requirements (#5688) | 2024-05-08 11:30:15 +08:00
test_inference_engine.py | [Inference] Fix bugs and docs for feat/online-server (#5598) | 2024-05-08 15:20:53 +00:00
test_kvcache_manager.py | [Fix] Fix & Update Inference Tests (compatibility w/ main) | 2024-05-05 16:28:56 +00:00
test_request_handler.py | [Fix] Fix & Update Inference Tests (compatibility w/ main) | 2024-05-05 16:28:56 +00:00
test_rpc_engine.py | [Feat]Inference RPC Server Support (#5705) | 2024-05-14 10:00:55 +08:00