ColossalAI/tests/test_infer_ops/triton
Jianghai ce7ade3882
[inference] chatglm2 infer demo (#4724)
* add chatglm2

* add

* gather needed kernels

* fix some bugs

* finish context forward

* finish context stage

* fix

* add

* pause

* add

* fix bugs

* finish chatglm

* fix bug

* change some logic

* fix bugs

* change some logics

* add

* add

* add

* fix

* fix tests

* fix
2023-09-22 11:12:50 +08:00
kernel_utils.py
test_bloom_context_attention.py
test_copy_kv_dest.py
test_layernorm_triton.py
test_llama2_token_attn.py [inference] chatglm2 infer demo (#4724) 2023-09-22 11:12:50 +08:00
test_llama_context_attention.py
test_rotary_embedding.py
test_self_attention_nonfusion.py
test_softmax.py
test_token_attn_1.py
test_token_attn_2.py
test_token_attn_fwd.py
test_token_softmax.py