ColossalAI

Commit Graph

Author	SHA1	Message	Date
digger yu	8abc87798f	fix Tensor is not defined (#4129 )	1 year ago
Hongxin Liu	ae02d4e4f7	[bf16] add bf16 support (#3882 ) * [bf16] add bf16 support for fused adam (#3844) * [bf16] fused adam kernel support bf16 * [test] update fused adam kernel test * [test] update fused adam test * [bf16] cpu adam and hybrid adam optimizers support bf16 (#3860) * [bf16] implement mixed precision mixin and add bf16 support for low level zero (#3869) * [bf16] add mixed precision mixin * [bf16] low level zero optim support bf16 * [text] update low level zero test * [text] fix low level zero grad acc test * [bf16] add bf16 support for gemini (#3872) * [bf16] gemini support bf16 * [test] update gemini bf16 test * [doc] update gemini docstring * [bf16] add bf16 support for plugins (#3877) * [bf16] add bf16 support for legacy zero (#3879) * [zero] init context support bf16 * [zero] legacy zero support bf16 * [test] add zero bf16 test * [doc] add bf16 related docstring for legacy zero	2 years ago
digger yu	70c8cdecf4	[nfc] fix typo colossalai/cli fx kernel (#3847 ) * fix typo colossalai/autochunk auto_parallel amp * fix typo colossalai/auto_parallel nn utils etc. * fix typo colossalai/auto_parallel autochunk fx/passes etc. * fix typo docs/ * change placememt_policy to placement_policy in docs/ and examples/ * fix typo colossalai/ applications/ * fix typo colossalai/cli fx kernel	2 years ago
digger-yu	b9a8dff7e5	[doc] Fix typo under colossalai and doc(#3618 ) * Fixed several spelling errors under colossalai * Fix the spelling error in colossalai and docs directory * Cautious Changed the spelling error under the example folder * Update runtime_preparation_pass.py revert autograft to autograd * Update search_chunk.py utile to until * Update check_installation.py change misteach to mismatch in line 91 * Update 1D_tensor_parallel.md revert to perceptron * Update 2D_tensor_parallel.md revert to perceptron in line 73 * Update 2p5D_tensor_parallel.md revert to perceptron in line 71 * Update 3D_tensor_parallel.md revert to perceptron in line 80 * Update README.md revert to resnet in line 42 * Update reorder_graph.py revert to indice in line 7 * Update p2p.py revert to megatron in line 94 * Update initialize.py revert to torchrun in line 198 * Update routers.py change to detailed in line 63 * Update routers.py change to detailed in line 146 * Update README.md revert random number in line 402	2 years ago
zbian	7bc0afc901	updated flash attention usage	2 years ago
Frank Lee	95a36eae63	[kernel] added kernel loader to softmax autograd function (#3093 ) * [kernel] added kernel loader to softmax autograd function * [release] v0.2.6	2 years ago
ver217	823f3b9cf4	[doc] add deepspeed citation and copyright (#2996 ) * [doc] add deepspeed citation and copyright * [doc] add deepspeed citation and copyright * [doc] add deepspeed citation and copyright	2 years ago
ver217	090f14fd6b	[misc] add reference (#2930 ) * [misc] add reference * [misc] add license	2 years ago
Frank Lee	918bc94b6b	[triton] added copyright information for flash attention (#2835 ) * [triton] added copyright information for flash attention * polish code	2 years ago
Frank Lee	dd14783f75	[kernel] fixed repeated loading of kernels (#2549 ) * [kernel] fixed repeated loading of kernels * polish code * polish code	2 years ago
Frank Lee	8b7495dd54	[example] integrate seq-parallel tutorial with CI (#2463 )	2 years ago
jiaruifang	69d9180c4b	[hotfix] issue #2388	2 years ago
Frank Lee	40d376c566	[setup] support pre-build and jit-build of cuda kernels (#2374 ) * [setup] support pre-build and jit-build of cuda kernels * polish code * polish code * polish code * polish code * polish code * polish code	2 years ago
Jiarui Fang	db6eea3583	[builder] reconfig op_builder for pypi install (#2314 )	2 years ago
Jiarui Fang	16cc8e6aa7	[builder] MOE builder (#2277 )	2 years ago
xcnick	85178a397a	[hotfix] fix error for torch 2.0 (#2243 )	2 years ago
Jiarui Fang	db4cbdc7fb	[builder] builder for scaled_upper_triang_masked_softmax (#2234 )	2 years ago
Jiarui Fang	54de05da5d	[builder] polish builder with better base class (#2216 ) * [builder] polish builder * remove print	2 years ago
Jiarui Fang	7675792100	[builder] raise Error when CUDA_HOME is not set (#2213 )	2 years ago
Jiarui Fang	1cb532ffec	[builder] multihead attn runtime building (#2203 ) * [hotfix] correcnt cpu_optim runtime compilation * [builder] multihead attn * fix bug * fix a bug	2 years ago
Jiarui Fang	5682e6d346	[hotfix] correcnt cpu_optim runtime compilation (#2197 )	2 years ago
Jiarui Fang	355ffb386e	[builder] unified cpu_optim fused_optim inferface (#2190 )	2 years ago
Jiarui Fang	bc0e271e71	[buider] use builder() for cpu adam and fused optim in setup.py (#2187 )	2 years ago
Jiarui Fang	d42afd30f8	[builder] runtime adam and fused_optim builder (#2184 )	2 years ago
アマデウス	077a66dd81	updated attention kernel (#2133 )	2 years ago
HELSON	e7d3afc9cc	[optimizer] add div_scale for optimizers (#2117 ) * [optimizer] add div_scale for optimizers * [zero] use div_scale in zero optimizer * fix testing error	2 years ago
ver217	f8a7148dec	[kernel] move all symlinks of kernel to `colossalai._C` (#1971 )	2 years ago
zbian	6877121377	updated flash attention api	2 years ago
アマデウス	4268ae017b	[kernel] added jit warmup (#1792 )	2 years ago
xcnick	e0da01ea71	[hotfix] fix build error when torch version >= 1.13 (#1803 )	2 years ago
oahzxl	9639ea88fc	[kernel] more flexible flashatt interface (#1804 )	2 years ago
oahzxl	501a9e9cd2	[hotfix] polish flash attention (#1802 )	2 years ago
Jiarui Fang	c248800359	[kernel] skip tests of flash_attn and triton when they are not available (#1798 )	2 years ago
oahzxl	25952b67d7	[feat] add flash attention (#1762 )	2 years ago
ver217	12b4887097	[hotfix] fix CPUAdam kernel nullptr (#1410 )	2 years ago
binmakeswell	7696cead8d	Recover kernal files	2 years ago
Maruyama_Aya	87f679aeae	[NFC] polish colossalai/kernel/cuda_native/csrc/kernels/include/kernels.h code style (#1291 )	2 years ago
doubleHU	d6f5ef8860	[NFC] polish colossalai/kernel/cuda_native/csrc/kernels/transform_kernels.cu code style (#1286 )	2 years ago
yuxuan-lou	5f6ab35d25	Hotfix/format (#1274 ) * [NFC] Polish colossalai/kernel/cuda_native/csrc/multi_tensor_lamb.cu code style. (#937) * [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/include/cuda_util.h code style * [NFC] polish colossalai/kernel/cuda_native/csrc/scaled_masked_softmax.cpp code style Co-authored-by: BoxiangW <45734921+BoxiangW@users.noreply.github.com>	2 years ago
binmakeswell	c95e18cdb9	[NFC] polish colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax.h code style (#1270 )	2 years ago
DouJS	db13f96333	[NFC] polish colossalai/kernel/cuda_native/csrc/multi_tensor_apply.cuh code style (#1264 )	2 years ago
shenggan	5d7366b144	[NFC] polish colossalai/kernel/cuda_native/csrc/scaled_masked_softmax.h code style (#1263 )	2 years ago
ziyu huang	f1cafcc73a	[NFC] polish colossalai/kernel/cuda_native/csrc/kernels/dropout_kernels.cu code style (#1261 ) Co-authored-by: “Arsmart123 <202476410arsmart@gmail.com>	2 years ago
Sze-qq	f8b9aaef47	[NFC] polish colossalai/kernel/cuda_native/csrc/type_shim.h code style (#1260 )	2 years ago
ver217	e4f555f29a	[optim] refactor fused sgd (#1134 )	2 years ago
zhengzangw	ae7c338105	[NFC] polish colossalai/kernel/cuda_native/csrc/colossal_C_frontend.cpp code style	3 years ago
Frank Lee	533d0c46d8	[kernel] fixed the include bug in dropout kernel (#999 )	3 years ago
puck_WCR	bda70b4b66	[NFC] polish colossalai/kernel/cuda_native/layer_norm.py code style (#980 )	3 years ago
Kai Wang (Victor Kai)	c50c08dcbb	[NFC] polish colossalai/kernel/cuda_native/csrc/kernels/dropout_kernels.cu code style (#979 )	3 years ago
binmakeswell	f28c021376	[NFC] polish colossalai/kernel/cuda_native/csrc/multi_tensor_sgd_kernel.cu code style (#978 )	3 years ago


114 Commits (74257cb4461702deb357eb1b599788993a7757ad)