LuGY
2883040286
[example] change qkv processing (#870)
2022-04-26 13:33:27 +08:00
LuGY
13ed4b6441
[model zoo] add activation offload for gpt model (#582)
2022-03-31 17:42:20 +08:00
ver217
d70f43dd7a
embedding remove attn mask (#474)
2022-03-21 14:53:23 +08:00
ver217
1559c0df41
fix attn mask shape of gpt (#472)
2022-03-21 12:01:31 +08:00
ver217
304263c2ce
fix gpt attention mask (#461)
2022-03-18 17:24:19 +08:00
Frank Lee
0f5f5dd556
fixed gpt attention mask in pipeline (#430)
2022-03-16 14:23:43 +08:00
アマデウス
9ee197d0e9
moved env variables to global variables; (#215)
added branch context;
added vocab parallel layers;
moved split_batch from load_batch to tensor parallel embedding layers;
updated gpt model;
updated unit test cases;
fixed few collective communicator bugs
2022-02-15 11:31:13 +08:00
ver217
7904baf6e1
fix layers/schedule for hybrid parallelization (#111) (#112)
2022-01-04 20:52:31 +08:00
アマデウス
e5b9f9a08d
added gpt model & benchmark (#95)
2021-12-30 14:43:30 +08:00