mirror of https://github.com/hpcaitech/ColossalAI
* add rotary embedding kernel * add rotary_embedding_kernel * add fused rotary_emb and kvcache memcopy * add fused_rotary_emb_and_cache_kernel.cu * add fused_rotary_emb_and_memcopy * fix bugs in fused_rotary_emb_and_cache_kernel.cu * fix ci bugs * use vec memcopy and opt the global memory access * fix code style * fix test_rotary_embdding_unpad.py * codes revised based on the review comments * fix bugs about include path * rm inline |
---|---|---|
include | ||
pybind | ||
utils | ||
activation_kernel.cu | ||
decode_kv_cache_memcpy_kernel.cu | ||
fused_rotary_emb_and_cache_kernel.cu | ||
layer_norm_kernel.cu | ||
moe_kernel.cu | ||
multi_tensor_adam_kernel.cu | ||
multi_tensor_apply.cuh | ||
multi_tensor_l2norm_kernel.cu | ||
multi_tensor_lamb_kernel.cu | ||
multi_tensor_scale_kernel.cu | ||
multi_tensor_sgd_kernel.cu | ||
rms_layernorm_kernel.cu | ||
scaled_masked_softmax.h | ||
scaled_masked_softmax_kernel.cu | ||
scaled_upper_triang_masked_softmax.h | ||
scaled_upper_triang_masked_softmax_kernel.cu |