ColossalAI/tests
Cuiqing Li 459a88c806
[Kernels]Updated Triton kernels into 2.1.0 and adding flash-decoding for llama token attention (#4965)
* adding flash-decoding
* clean
* adding kernel
* adding flash-decoding
* add integration
* add
* adding kernel
* adding kernel
* adding triton 2.1.0 features for inference
* update bloom triton kernel
* remove useless vllm kernels
* clean codes
* fix
* adding files
* fix readme
* update llama flash-decoding
---------

Co-authored-by: cuiqing.li <lixx336@gmail.com>
2023-10-30 14:04:37 +08:00
kit [Inference] Dynamic Batching Inference, online and offline (#4953) 2023-10-30 10:52:19 +08:00
test_analyzer [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_auto_parallel [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_autochunk [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_booster [gemini] support gradient accumulation (#4869) 2023-10-17 14:07:21 +08:00
test_checkpoint_io [hotfix] fix lr scheduler bug in torch 2.0 (#4864) 2023-10-12 14:04:24 +08:00
test_cluster [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_config [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_device [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_fx [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_gptq [feature] add gptq for inference (#4754) 2023-09-22 11:02:50 +08:00
test_infer [Kernels]Updated Triton kernels into 2.1.0 and adding flash-decoding for llama token attention (#4965) 2023-10-30 14:04:37 +08:00
test_infer_ops/triton [Kernels]Updated Triton kernels into 2.1.0 and adding flash-decoding for llama token attention (#4965) 2023-10-30 14:04:37 +08:00
test_lazy [lazy] support from_pretrained (#4801) 2023-09-26 11:04:11 +08:00
test_legacy [test] merge old components to test to model zoo (#4945) 2023-10-20 10:35:08 +08:00
test_moe [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_optimizer [test] merge old components to test to model zoo (#4945) 2023-10-20 10:35:08 +08:00
test_pipeline [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_shardformer [hotfix] fix torch 2.0 compatibility (#4936) 2023-10-18 11:05:25 +08:00
test_smoothquant [inference] Add smmoothquant for llama (#4904) 2023-10-16 11:28:44 +08:00
test_tensor [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_utils [misc] update pre-commit and run all files (#4752) 2023-09-19 14:20:26 +08:00
test_zero [test] merge old components to test to model zoo (#4945) 2023-10-20 10:35:08 +08:00
__init__.py [zero] Update sharded model v2 using sharded param v2 (#323) 2022-03-11 15:50:28 +08:00