ColossalAI/colossalai/kernel
Yuanheng Zhao a37f82629d [Inference/SpecDec] Add Speculative Decoding Implementation (#5423)
* fix flash decoding mask during verification

* add spec-dec

* add test for spec-dec

* revise drafter init

* remove drafter sampling

* retire past kv in drafter

* (trivial) rename attrs

* (trivial) rename arg

* revise how we enable/disable spec-dec
2024-04-10 11:07:52 +08:00
jit [npu] change device to accelerator api (#5239) 2024-01-09 10:20:05 +08:00
triton [Inference/SpecDec] Add Speculative Decoding Implementation (#5423) 2024-04-10 11:07:52 +08:00
__init__.py [feat] refactored extension module (#5298) 2024-01-25 17:01:48 +08:00
extensions [feat] refactored extension module (#5298) 2024-01-25 17:01:48 +08:00
kernel_loader.py [Fix] resolve conflicts of merging main 2024-04-08 16:21:47 +08:00