ColossalAI

Yuanheng Zhao d85d91435a [Inference/SpecDec] Support GLIDE Drafter Model (#5455)	2024-04-10 11:07:52 +08:00
* add glide-llama policy and modeling
* update glide modeling, compatible with transformers 4.36.2
* revise glide llama modeling/usage
* fix issues of glimpsing large kv
* revise the way re-loading params for glide drafter
* fix drafter and engine tests
* enable convert to glide strict=False
* revise glide llama modeling
* revise vicuna prompt template
* revise drafter and tests
* apply usage of glide model in engine
kit	[devops] remove post commit ci (#5566)	2024-04-08 15:09:40 +08:00
test_analyzer	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
test_auto_parallel	[npu] change device to accelerator api (#5239)	2024-01-09 10:20:05 +08:00
test_autochunk	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
test_booster	[shardformer] fix pipeline forward error if custom layer distribution is used (#5189)	2024-03-27 13:57:00 +08:00
test_checkpoint_io	[devops] remove post commit ci (#5566)	2024-04-08 15:09:40 +08:00
test_cluster	[shardformer] Sequence Parallelism Optimization (#5533)	2024-04-03 17:15:47 +08:00
test_config	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
test_device	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
test_fx	[misc] update pre-commit and run all files (#4752)	2023-09-19 14:20:26 +08:00
test_gptq	[devops] remove post commit ci (#5566)	2024-04-08 15:09:40 +08:00
test_infer	[Inference/SpecDec] Support GLIDE Drafter Model (#5455)	2024-04-10 11:07:52 +08:00
test_lazy	[devops] remove post commit ci (#5566)	2024-04-08 15:09:40 +08:00
test_legacy	[npu] change device to accelerator api (#5239)	2024-01-09 10:20:05 +08:00
test_moe	[hotfix] set return_outputs=False in examples and polish code (#5404)	2024-03-25 12:31:09 +08:00
test_optimizer	[devops] remove post commit ci (#5566)	2024-04-08 15:09:40 +08:00
test_pipeline	[devops] remove post commit ci (#5566)	2024-04-08 15:09:40 +08:00
test_shardformer	[shardformer] Sequence Parallelism Optimization (#5533)	2024-04-03 17:15:47 +08:00
test_smoothquant	[inference] Add smoothquant for llama (#4904)	2023-10-16 11:28:44 +08:00
test_tensor	fixed layout converter caching and updated tester	2024-03-26 17:22:27 +08:00
test_zero	[npu] change device to accelerator api (#5239)	2024-01-09 10:20:05 +08:00
__init__.py	[zero] Update sharded model v2 using sharded param v2 (#323)	2022-03-11 15:50:28 +08:00