ColossalAI/colossalai/kernel/cuda_native/csrc
Hongxin Liu ae02d4e4f7
[bf16] add bf16 support (#3882)
* [bf16] add bf16 support for fused adam (#3844)

* [bf16] fused adam kernel support bf16

* [test] update fused adam kernel test

* [test] update fused adam test

* [bf16] cpu adam and hybrid adam optimizers support bf16 (#3860)

* [bf16] implement mixed precision mixin and add bf16 support for low level zero (#3869)

* [bf16] add mixed precision mixin

* [bf16] low level zero optim support bf16

* [test] update low level zero test

* [test] fix low level zero grad acc test

* [bf16] add bf16 support for gemini (#3872)

* [bf16] gemini support bf16

* [test] update gemini bf16 test

* [doc] update gemini docstring

* [bf16] add bf16 support for plugins (#3877)

* [bf16] add bf16 support for legacy zero (#3879)

* [zero] init context support bf16

* [zero] legacy zero support bf16

* [test] add zero bf16 test

* [doc] add bf16 related docstring for legacy zero
2023-06-05 15:58:31 +08:00
kernels [doc] add deepspeed citation and copyright (#2996) 2023-03-04 20:08:11 +08:00
colossal_C_frontend.cpp [optimizer] add div_scale for optimizers (#2117) 2022-12-12 17:58:57 +08:00
compat.h
cpu_adam.cpp
cpu_adam.h
layer_norm_cuda.cpp
layer_norm_cuda_kernel.cu
moe_cuda.cpp
moe_cuda_kernel.cu
multi_tensor_adam.cu [doc] add deepspeed citation and copyright (#2996) 2023-03-04 20:08:11 +08:00
multi_tensor_apply.cuh [doc] add deepspeed citation and copyright (#2996) 2023-03-04 20:08:11 +08:00
multi_tensor_l2norm_kernel.cu
multi_tensor_lamb.cu
multi_tensor_scale_kernel.cu
multi_tensor_sgd_kernel.cu
multihead_attention_1d.cpp [hotfix] fix error for torch 2.0 (#2243) 2022-12-30 23:11:55 +08:00
multihead_attention_1d.h [hotfix] fix error for torch 2.0 (#2243) 2022-12-30 23:11:55 +08:00
scaled_masked_softmax.cpp
scaled_masked_softmax.h
scaled_masked_softmax_cuda.cu
scaled_upper_triang_masked_softmax.cpp
scaled_upper_triang_masked_softmax.h
scaled_upper_triang_masked_softmax_cuda.cu
type_shim.h [bf16] add bf16 support (#3882) 2023-06-05 15:58:31 +08:00