ColossalAI/colossalai/inference/modeling/models
Runyu Lu 74c47921fa
[Fix] Llama3 Load/Omit CheckpointIO Temporarily (#5717)
* Fix Llama3 Load error
* Omit Checkpoint IO Temporarily
2024-05-14 20:17:43 +08:00
__init__.py fix bugs in request_handler 2024-01-11 13:39:56 +00:00
glide_llama.py [Inference/SpecDec] Support GLIDE Drafter Model (#5455) 2024-04-10 11:07:52 +08:00
nopadding_baichuan.py add paged-attentionv2: support seq length split across thread block (#5707) 2024-05-14 12:46:54 +08:00
nopadding_llama.py [Fix] Llama3 Load/Omit CheckpointIO Temporarily (#5717) 2024-05-14 20:17:43 +08:00