ColossalAI/tests/test_infer/test_ops
Yuanheng Zhao a37f82629d [Inference/SpecDec] Add Speculative Decoding Implementation (#5423)
* fix flash decoding mask during verification

* add spec-dec

* add test for spec-dec

* revise drafter init

* remove drafter sampling

* retire past kv in drafter

* (trivial) rename attrs

* (trivial) rename arg

* revise how we enable/disable spec-dec
2024-04-10 11:07:52 +08:00
cuda [Inference/Kernel]Add get_cos_and_sin Kernel (#5528) 2024-04-01 13:47:14 +08:00
triton [Inference/SpecDec] Add Speculative Decoding Implementation (#5423) 2024-04-10 11:07:52 +08:00
__init__.py [Inference]Add CUDA KVCache Kernel (#5406) 2024-02-28 14:36:50 +08:00