Frank Lee
|
40d376c566
|
[setup] support pre-build and jit-build of cuda kernels (#2374)
* [setup] support pre-build and jit-build of cuda kernels
* polish code
* polish code
* polish code
* polish code
* polish code
* polish code
|
2023-01-06 20:50:26 +08:00 |
Jiarui Fang
|
16cc8e6aa7
|
[builder] MOE builder (#2277)
|
2023-01-03 20:29:39 +08:00 |
zbian
|
e94c79f15b
|
improved allgather & reducescatter for 3d
|
2023-01-03 17:46:08 +08:00 |
アマデウス
|
622f863291
|
[hotfix] Jit type hint #2161 (#2164)
|
2022-12-22 10:17:03 +08:00 |
ver217
|
f8a7148dec
|
[kernel] move all symlinks of kernel to `colossalai._C` (#1971)
|
2022-11-17 13:42:33 +08:00 |
アマデウス
|
e52f9d9109
|
[tensorparallel] fixed tp layers (#1938)
|
2022-11-14 17:34:03 +08:00 |
Jiarui Fang
|
986f8cbaa7
|
[inference] overlap comm and compute in Linear1D_Row when stream_chunk_num > 1 (#1876)
|
2022-11-10 17:36:42 +08:00 |
Jiarui Fang
|
c2947dadf1
|
[inference] streaming Linear 1D Row inference (#1874)
|
2022-11-10 17:03:21 +08:00 |
zbian
|
653b0a620e
|
added skip_bias_add for non-tp linear
|
2022-11-09 15:41:08 +08:00 |
アマデウス
|
4268ae017b
|
[kernel] added jit warmup (#1792)
|
2022-11-08 16:22:23 +08:00 |
kurisusnowdeng
|
0b8161fab8
|
updated tp layers
|
2022-11-02 12:19:38 +08:00 |
HELSON
|
a088022efc
|
[moe] fix moe bugs (#1633)
|
2022-09-23 15:33:57 +08:00 |
HELSON
|
f7f2248771
|
[moe] fix MoE bugs (#1628)
* remove forced FP32 modules
* correct no_shard-contexts' positions
|
2022-09-22 13:56:30 +08:00 |
DouJS
|
f586887a90
|
[NFC] polish colossalai/nn/layer/colossalai_layer/dropout.py code style (#1568)
|
2022-09-08 22:11:04 +08:00 |
Ofey Chan
|
7cc052f6c0
|
[NFC] polish colossalai/nn/layer/colossalai_layer/linear.py (#1556)
|
2022-09-08 22:11:04 +08:00 |
ver217
|
10dd8226b1
|
add gather_output for VocabParallelClassifier1D (#1569)
|
2022-09-08 16:40:56 +08:00 |
ver217
|
ae71036cd2
|
[utils] refactor parallel layers checkpoint and bcast model on loading checkpoint (#1548)
* refactor parallel layer
* broadcast rank0 model after load ckpt
|
2022-09-06 20:18:35 +08:00 |
runluo
|
f83c4d6597
|
[NFC] polish colossalai/nn/layer/wrapper/pipeline_wrapper.py code style (#1303)
|
2022-07-13 19:01:07 +08:00 |
XYE
|
e83b2ce853
|
[NFC] polish colossalai/nn/layer/vanilla/layers.py code style (#1295)
|
2022-07-13 12:08:21 +08:00 |
Liping233
|
1000a41fd5
|
[NFC] polish colossalai/nn/layer/vanilla/__init__.py code style (#1293)
|
2022-07-13 12:08:21 +08:00 |
Wangbo Zhao(黑色枷锁)
|
552667825b
|
[NFC] polish colossalai/nn/layer/parallel_1d/layers.py code style (#1290)
|
2022-07-13 12:08:21 +08:00 |
Jiatong Han
|
38e3ccd1e9
|
[NFC] polish colossalai/nn/layer/parallel_sequence/layers.py code style (#1280)
Co-authored-by: JThh <jiatong.han@u.nus.edu>
|
2022-07-13 12:08:21 +08:00 |
Geng Zhang
|
0e06f62160
|
[NFC] polish colossalai/nn/layer/parallel_sequence/_operation.py code style (#1266)
|
2022-07-13 12:08:21 +08:00 |
superhao1995
|
f660152c73
|
[NFC] polish colossalai/nn/layer/parallel_3d/_operation.py code style (#1258)
Co-authored-by: Research <research@soccf-snr3-017.comp.nus.edu.sg>
|
2022-07-13 12:08:21 +08:00 |
Frank Lee
|
2b2dc1c86b
|
[pipeline] refactor the pipeline module (#1087)
* [pipeline] refactor the pipeline module
* polish code
|
2022-06-10 11:27:38 +08:00 |
Ziyue Jiang
|
0653c63eaa
|
[Tensor] 1d row embedding (#1075)
* Add CPU 1d row embedding
* polish
|
2022-06-08 12:04:59 +08:00 |
Ziheng Qin
|
571f12eff3
|
[NFC] polish colossalai/nn/layer/utils/common.py code style (#983)
|
2022-05-17 10:25:06 +08:00 |
shenggan
|
18542b47fc
|
[NFC] polish colossalai/nn/layer/parallel_2d/layers.py code style (#976)
|
2022-05-17 10:25:06 +08:00 |
Zirui Zhu
|
598cde4a0f
|
[NFC] polish colossalai/nn/layer/parallel_2p5d/layers.py code style (#972)
|
2022-05-17 10:25:06 +08:00 |
LuGY
|
fb5bc6cb28
|
[NFC] polish colossalai/nn/layer/parallel_3d/layers.py code style (#966)
|
2022-05-17 10:25:06 +08:00 |
ver217
|
58580b50fe
|
Revert "[NFC] Hotfix/format (#984)" (#986)
This reverts commit 0772828fba.
|
2022-05-17 10:23:38 +08:00 |
binmakeswell
|
0772828fba
|
[NFC] Hotfix/format (#984)
* [NFC] Polish colossalai/kernel/cuda_native/csrc/multi_tensor_lamb.cu code style. (#937)
* [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/include/cuda_util.h code style (#939)
* [NFC] polish colossalai/kernel/cuda_native/csrc/cpu_adam.cpp code style (#936)
* [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/include/block_reduce.h code style (#938)
* [NFC] polish moe_cuda_kernel.cu code style (#940)
Co-authored-by: Xiao Ye <xiaoye2@illinois.edu>
* [NFC] polish pre-commit run --files colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax_cuda.cu code style (#943)
* [NFC] polish colossalai/kernel/cuda_native/csrc/moe_cuda.cpp code style (#942)
* [NFC] polish colossalai/kernel/cuda_native/csrc/cpu_adam.h code style (#945)
* [NFC] polish colossalai/kernel/jit/bias_gelu.py code style (#946)
Co-authored-by: jnbai <897086360@qq.com>
* [NFC] polish colossalai/kernel/cuda_native/csrc/scaled_masked_softmax_cuda.cu code style (#949)
Co-authored-by: Jiatong <jiatong.han@u.nus.edu>
* [NFC] polish colossalai/builder/pipeline.py code style (#951)
* [NFC] polish colossalai/kernel/cuda_native/csrc/multihead_attention_1d.cpp code style (#952)
* [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/cross_entropy.cu code style (#953)
Co-authored-by: 何晓昕 <cautious@hexiaoxins-MacBook-Pro.local>
* [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/softmax_kernels.cu code style (#954)
* [NFC] polish colossalai/kernel/cuda_native/scaled_softmax.py code style (#955)
* [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/include/context.h code style (#956)
Co-authored-by: RichardoLuo <14049555596@qq.com>
* [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/include/cross_entropy_layer.h code style (#957)
* [NFC] polish colossalai/kernel/cuda_native/csrc/multi_tensor_l2norm_kernel.cu code style (#958)
* [NFC] polish colossalai/kernel/cuda_native/csrc/multihead_attention_1d.h code style (#962)
* [NFC] polish colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax.cpp code style (#959)
* [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/general_kernels.cu code style (#963)
Co-authored-by: “Arsmart123 <202476410arsmart@gmail.com>
* [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/include/softmax.h code style (#964)
* [NFC] polish __init__.py code style (#965)
* [NFC] polish colossalai/nn/layer/parallel_3d/layers.py code style (#966)
* [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/include/feed_forward.h code style (#968)
* [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/include/dropout.h code style (#970)
* [NFC] polish colossalai/nn/layer/parallel_2p5d/layers.py code style (#972)
* [NFC] polish colossalai/kernel/cuda_native/csrc/layer_norm_cuda.cpp code style (#973)
* [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/normalize_kernels.cu code style (#974)
* [NFC] polish colossalai/kernel/cuda_native/csrc/multi_tensor_scale_kernel.cu code style (#977)
* [NFC] polish colossalai/nn/layer/parallel_2d/layers.py code style (#976)
* [NFC] polish colossalai/kernel/cuda_native/csrc/multi_tensor_sgd_kernel.cu code style (#978)
* [NFC] polish colossalai/kernel/cuda_native/csrc/kernels/dropout_kernels.cu code style (#979)
* [NFC] polish colossalai/kernel/cuda_native/layer_norm.py code style (#980)
* [NFC] polish colossalai/nn/layer/utils/common.py code style (#983)
Co-authored-by: BoxiangW <45734921+BoxiangW@users.noreply.github.com>
Co-authored-by: yuxuan-lou <83441848+yuxuan-lou@users.noreply.github.com>
Co-authored-by: Geng Zhang <34452939+zxgx@users.noreply.github.com>
Co-authored-by: Maruyama_Aya <38985202+MaruyamaAya@users.noreply.github.com>
Co-authored-by: XYE <92607131+Itok2000u@users.noreply.github.com>
Co-authored-by: Xiao Ye <xiaoye2@illinois.edu>
Co-authored-by: HaoyuQin <79465534+coder-chin@users.noreply.github.com>
Co-authored-by: wky <64853922+wangkuangyi@users.noreply.github.com>
Co-authored-by: bajiaoyu517 <59548007+bajiaoyu517@users.noreply.github.com>
Co-authored-by: luoling-LC <105470086+luoling-LC@users.noreply.github.com>
Co-authored-by: jnbai <897086360@qq.com>
Co-authored-by: JT.Han <59948448+JThh@users.noreply.github.com>
Co-authored-by: Jiatong <jiatong.han@u.nus.edu>
Co-authored-by: xyupeng <99191637+xyupeng@users.noreply.github.com>
Co-authored-by: Sze-qq <68757353+Sze-qq@users.noreply.github.com>
Co-authored-by: Cautiousss <48676630+Cautiousss@users.noreply.github.com>
Co-authored-by: 何晓昕 <cautious@hexiaoxins-MacBook-Pro.local>
Co-authored-by: Luxios22 <67457897+Luxios22@users.noreply.github.com>
Co-authored-by: Wangbo Zhao(黑色枷锁) <56866854+wangbo-zhao@users.noreply.github.com>
Co-authored-by: RichardoLuo <50363844+RichardoLuo@users.noreply.github.com>
Co-authored-by: RichardoLuo <14049555596@qq.com>
Co-authored-by: doubleHU <98150031+huxin711@users.noreply.github.com>
Co-authored-by: runluo <68489000+run-qiao@users.noreply.github.com>
Co-authored-by: MaxT <854721132@qq.com>
Co-authored-by: superhao1995 <804673818@qq.com>
Co-authored-by: ziyu huang <huang0ziyu@gmail.com>
Co-authored-by: “Arsmart123 <202476410arsmart@gmail.com>
Co-authored-by: Yuer867 <62204893+Yuer867@users.noreply.github.com>
Co-authored-by: lucasliunju <lucasliunju@gmail.com>
Co-authored-by: LuGY <74758262+Gy-Lu@users.noreply.github.com>
Co-authored-by: ExtremeViscent <zhangyiqi55732@sina.com>
Co-authored-by: Xu Kai <xukai16@foxmail.com>
Co-authored-by: Zirui Zhu <zhuzr21@gmail.com>
Co-authored-by: Ofey Chan <ofey206@gmail.com>
Co-authored-by: DouJS <dujiangsu@163.com>
Co-authored-by: Jie Zhu <chore.08-protist@icloud.com>
Co-authored-by: shenggan <csg19971016@gmail.com>
Co-authored-by: Kai Wang (Victor Kai) <37533040+kaiwang960112@users.noreply.github.com>
Co-authored-by: puck_WCR <46049915+WANG-CR@users.noreply.github.com>
Co-authored-by: Ziheng Qin <37519855+henryqin1997@users.noreply.github.com>
|
2022-05-17 09:54:49 +08:00 |
HELSON
|
e5ea3fdeef
|
[gemini] add GeminiMemoryManager (#832)
* refactor StatefulTensor, tensor utilities
* add unit test for GeminiMemoryManager
|
2022-04-24 13:08:48 +08:00 |
Ziyue Jiang
|
4b01da24cd
|
[TP] change the check assert in split batch 2d (#772)
|
2022-04-16 21:29:57 +08:00 |
アマデウス
|
b8899e0905
|
[TP] allow layernorm without bias (#750)
|
2022-04-14 11:43:56 +08:00 |
Frank Lee
|
eda30a058e
|
[compatibility] fixed tensor parallel compatibility with torch 1.9 (#700)
|
2022-04-11 13:44:50 +08:00 |
HELSON
|
a9b8300d54
|
[zero] improve adaptability for not-shard parameters (#708)
* adapt post grad hooks for not-shard parameters
* adapt optimizer for not-shard parameters
* offload gradients for not-replicated parameters
|
2022-04-11 13:38:51 +08:00 |
アマデウス
|
3fc8a204dc
|
[]Corrected 3d vocab parallel embedding (#707)
|
2022-04-11 10:17:55 +08:00 |
Liang Bowen
|
828e465622
|
[hotfix] Raise messages for indivisible batch sizes with tensor parallelism (#622)
|
2022-04-02 16:12:04 +08:00 |
アマデウス
|
77ad24bf94
|
[model checkpoint] updated saving/loading for 3d layers (#597)
|
2022-04-01 16:52:47 +08:00 |
アマデウス
|
93089ed708
|
[model checkpoint] updated saving/loading for 2.5d layers (#596)
|
2022-04-01 16:52:33 +08:00 |
アマデウス
|
c50bfb807b
|
[model checkpoint] updated saving/loading for 1d layers (#594)
|
2022-04-01 16:51:52 +08:00 |
アマデウス
|
7636d518e1
|
[model checkpoint] updated saving/loading for 2d layers (#595)
|
2022-04-01 16:50:34 +08:00 |
アマデウス
|
cd13b63832
|
[model checkpoint] reworked unified layers for ease of save/load states (#593)
|
2022-04-01 16:49:56 +08:00 |
Ziyue Jiang
|
1c40ee8749
|
[TP] add assert for tp1d (#621)
|
2022-04-01 16:44:23 +08:00 |
ver217
|
8432dc7080
|
polish moe docstring (#618)
|
2022-04-01 16:15:36 +08:00 |
HELSON
|
e6d50ec107
|
[zero] adapt zero for unsharded parameters (#561)
* support existing sharded and unsharded parameters in zero
* add unit test for moe-zero model init
* polish moe gradient handler
|
2022-03-31 18:34:11 +08:00 |
Wesley
|
46c9ba33da
|
update code format
|
2022-03-31 17:15:08 +08:00 |
Wesley
|
666cfd094a
|
fix parallel_input flag for Linear1D_Col gather_output
|
2022-03-31 17:15:08 +08:00 |
Liang Bowen
|
2c45efc398
|
html refactor (#555)
|
2022-03-31 11:36:56 +08:00 |