Commit Graph

9 Commits (354b7954d1fa6b21a5ba566f0d5ec099280ad315)

Author SHA1 Message Date
LuGY 6a3f9fda83
[cuda] modify the fused adam, support hybrid of fp16 and fp32 (#497) 2022-03-25 14:15:53 +08:00
ExtremeViscent eaac03ae1d [formart] format fixed for kernel\cuda_native codes (#335) 2022-03-11 15:50:28 +08:00
LuGY a3269de5c9 [zero] cpu adam kernel (#288)
* Added CPU Adam

* finished the cpu adam

* updated the license

* delete useless parameters, removed resnet

* modified the method off cpu adam unittest

* deleted some useless codes

* removed useless codes

Co-authored-by: ver217 <lhx0217@gmail.com>
Co-authored-by: Frank Lee <somerlee.9@gmail.com>
Co-authored-by: jiaruifang <fangjiarui123@gmail.com>
2022-03-11 15:50:28 +08:00
1SAA 219df6e685 Optimized MoE layer and fixed some bugs;
Decreased moe tests;

Added FFNExperts and ViTMoE model
2022-03-11 15:50:28 +08:00
アマデウス 9ee197d0e9 moved env variables to global variables; (#215)
added branch context;
added vocab parallel layers;
moved split_batch from load_batch to tensor parallel embedding layers;
updated gpt model;
updated unit test cases;
fixed few collective communicator bugs
2022-02-15 11:31:13 +08:00
HELSON 0f8c7f9804
Fixed docstring in colossalai (#171) 2022-01-21 10:44:30 +08:00
Frank Lee e2089c5c15
adapted for sequence parallel (#163) 2022-01-20 13:44:51 +08:00
ver217 f68eddfb3d
refactor kernel (#142) 2022-01-13 16:47:17 +08:00
shenggan 5c3843dc98
add colossalai kernel module (#55) 2021-12-21 12:19:52 +08:00