Commit Graph

  • c4ad1f7a58 polish sharded model docstr ver217 2022-03-18 16:35:27 +0800
  • 95622aa248 doc number1roy 2022-03-18 16:32:22 +0800
  • af185b5519 [test] fixed amp convergence comparison test (#454) Frank Lee 2022-03-18 16:28:16 +0800
  • a241f61b34 [zero] Update initialize for ZeRO (#458) ver217 2022-03-18 16:18:31 +0800
  • 342407b046 doc number1roy 2022-03-18 16:15:00 +0800
  • 5f7f5e5a07 doc number1roy 2022-03-18 15:55:59 +0800
  • f591a2ba48 polish code ver217 2022-03-18 15:08:51 +0800
  • 82637b6ac6 update zero engine ver217 2022-03-18 15:01:45 +0800
  • 8a8c852402 shard strategy receive pg in shard() / gather() ver217 2022-03-18 14:36:31 +0800
  • 56c4c3bba2 polish code ver217 2022-03-18 14:27:55 +0800
  • 642846d6f9 update sharded optim and fix zero init ctx (#457) ver217 2022-03-18 15:44:47 +0800
  • 573a1ae6ef doc number1roy 2022-03-18 15:30:56 +0800
  • 4f7dfb4443 update sharded optim and fix zero init ctx ver217 2022-03-18 13:17:53 +0800
  • e2e9f82588 Revert "[zero] update sharded optim and fix zero init ctx" (#456) Jiarui Fang 2022-03-18 15:22:43 +0800
  • 5e8b17a857 Revert "update sharded optim and fix zero init ctx" Jiarui Fang 2022-03-18 15:11:44 +0800
  • f72edc4e6c Revert "remove surplus imports" Jiarui Fang 2022-03-18 15:11:44 +0800
  • acd2861e4b Revert "rename variables" Jiarui Fang 2022-03-18 15:11:44 +0800
  • 94096754ee Revert "polish code" Jiarui Fang 2022-03-18 15:11:44 +0800
  • 35877805cb [test] fixed amp convergence comparison test FrankLeeeee 2022-03-18 14:42:24 +0800
  • 8cf7ff08cf polish code ver217 2022-03-18 13:57:20 +0800
  • e99af94ab8 rename variables ver217 2022-03-18 13:34:58 +0800
  • 46add4a5c5 remove surplus imports ver217 2022-03-18 13:32:02 +0800
  • 57567ee768 update sharded optim and fix zero init ctx ver217 2022-03-18 13:17:53 +0800
  • b53763265e refactored communication creation in MOE 1SAA 2022-03-17 15:24:07 +0800
  • b0e2773d95 polish code ver217 2022-03-18 13:57:20 +0800
  • 3bf8530b3b rename variables ver217 2022-03-18 13:34:58 +0800
  • c8f94756d5 remove surplus imports ver217 2022-03-18 13:32:02 +0800
  • c9e5f6d852 update sharded optim and fix zero init ctx ver217 2022-03-18 13:17:53 +0800
  • f27d801a13 [test] optimized zero data parallel test (#452) Frank Lee 2022-03-18 11:35:54 +0800
  • 1b75889a24 [test] optimized zero data parallel test FrankLeeeee 2022-03-18 10:59:55 +0800
  • cfcc8271f3 [Bot] Automated submodule synchronization (#451) github-actions[bot] 2022-03-18 09:51:43 +0800
  • 92a23ff493 Automated submodule synchronization github-actions 2022-03-18 00:01:11 +0000
  • f23e038dc9 docs number1roy 2022-03-18 04:05:53 +0800
  • 45e0ac1cc3 doc number1roy 2022-03-18 03:39:04 +0800
  • 5acd3915a7 docs number1roy 2022-03-18 03:26:52 +0800
  • 9f82caff0b docs number1roy 2022-03-18 02:50:44 +0800
  • f3f397d6e9 Merge branch 'hpcaitech:main' into main Liang Bowen 2022-03-18 02:35:21 +0800
  • 1a2b71e901 Merge branch 'main' of github.com:number1roy/ColossalAI number1roy 2022-03-18 02:31:51 +0800
  • e3f44384e6 docs number1roy 2022-03-18 02:22:19 +0800
  • ac4513c56e [DevOps] remove unneeded dependency in build workflow (#449) Frank Lee 2022-03-17 17:29:02 +0800
  • 0fcfb1e00d [test] make zero engine test really work (#447) Jiarui Fang 2022-03-17 17:24:25 +0800
  • f1ae7c635e update actions/checkout config FrankLeeeee 2022-03-17 16:31:53 +0800
  • bb2790cf0b optimize engine and trainer test (#448) Frank Lee 2022-03-17 15:44:17 +0800
  • cf14de4745 remove unneeded dependency in build workflow FrankLeeeee 2022-03-17 15:31:11 +0800
  • 15c4cd3b67 polish code jiaruifang 2022-03-17 15:22:28 +0800
  • b40f19a871 optimize engine and trainer test FrankLeeeee 2022-03-17 15:16:10 +0800
  • a9bb13dc7c Merge branch 'main' into jiaruifang/revisit_zero_engine_test Jiarui Fang 2022-03-17 15:18:02 +0800
  • ac69f5f819 polish code jiaruifang 2022-03-17 15:11:50 +0800
  • 237d08e7ee [zero] hybrid cpu adam (#445) Jiarui Fang 2022-03-17 15:05:41 +0800
  • b72b8445c6 optimized context test time consumption (#446) Frank Lee 2022-03-17 14:40:52 +0800
  • 220f6844da optimized context test time consumption FrankLeeeee 2022-03-17 14:12:20 +0800
  • 154dd83af9 polish code jiaruifang 2022-03-17 13:57:23 +0800
  • f6d0b6aec3 hybrid adam jiaruifang 2022-03-17 13:32:41 +0800
  • 496cbb0760 [hotfix] fix initialize bug with zero (#442) Jiarui Fang 2022-03-17 13:16:22 +0800
  • e093089dc3 polish jiaruifang 2022-03-17 12:54:40 +0800
  • 533c56a8f8 polish code jiaruifang 2022-03-17 12:21:22 +0800
  • 20294a5b48 Merge branch 'main' of github.com:hpcaitech/ColossalAI into jiaruifang/hotfix_zero jiaruifang 2022-03-17 12:19:29 +0800
  • 56fdcdf610 Merge pull request #1 from number1roy/doc Liang Bowen 2022-03-17 12:18:47 +0800
  • 28e34869e0 doc Liang Bowen 2022-03-17 12:18:20 +0800
  • 124d08d159 Merge branch 'main' of github.com:number1roy/ColossalAI conflict number1roy 2022-03-17 12:12:59 +0800
  • 2f36482464 conflict number1roy 2022-03-17 12:12:44 +0800
  • 40eaa4c95c polish jiaruifang 2022-03-17 12:07:54 +0800
  • f1dc538acd polish code jiaruifang 2022-03-17 12:07:14 +0800
  • 369a23de82 polish jiaruifang 2022-03-17 10:48:01 +0800
  • 725a39f4bd update github CI with the current workflow (#441) Frank Lee 2022-03-17 10:38:04 +0800
  • caf185c3a0 update github CI with the current workflow FrankLeeeee 2022-03-17 10:18:37 +0800
  • 0799febcab polish code jiaruifang 2022-03-17 10:36:48 +0800
  • 6db81e0382 Merge branch 'main' of github.com:hpcaitech/ColossalAI into jiaruifang/hotfix_zero jiaruifang 2022-03-17 10:28:09 +0800
  • 5a1e33b97f update contributing.md with the current workflow (#440) Frank Lee 2022-03-17 10:28:04 +0800
  • 7eda3e0ad2 test engine with zero and mp jiaruifang 2022-03-17 10:24:31 +0800
  • 17b8274f8a [unitest] polish zero config in unittest (#438) Jiarui Fang 2022-03-17 10:20:53 +0800
  • 25d2a88841 update contributing.md with the current workflow FrankLeeeee 2022-03-17 10:07:37 +0800
  • 9ba8202045 update contributing.md with the current workflow FrankLeeeee 2022-03-17 10:01:09 +0800
  • 1d15ba54d0 add missing file jiaruifang 2022-03-17 09:31:00 +0800
  • 27500c9a6b polish zero config in unittest jiaruifang 2022-03-17 09:22:10 +0800
  • e1682ee569 Merge branch 'main' of github.com:hpcaitech/ColossalAI into jiaruifang/hybrid_adam jiaruifang 2022-03-16 19:31:40 +0800
  • 640a6cd304 [refactory] refactory the initialize method for new zero design (#431) Jiarui Fang 2022-03-16 19:29:37 +0800
  • 05a155dd10 polish code jiaruifang 2022-03-16 17:51:54 +0800
  • 4f85b687cf [misc] replace codebeat with codefactor on readme (#436) Frank Lee 2022-03-16 17:43:52 +0800
  • fa09bf12fb replace codebeat with codefactor on readme FrankLeeeee 2022-03-16 17:36:28 +0800
  • 45eaefa79d pass optimizer config jiaruifang 2022-03-16 17:24:09 +0800
  • bffd85bf34 added testing module (#435) Frank Lee 2022-03-16 17:20:05 +0800
  • ff3ff81887 added testing module FrankLeeeee 2022-03-16 17:04:43 +0800
  • bdc5428062 polish code jiaruifang 2022-03-16 16:48:19 +0800
  • dbdc9a7783 added Multiply Jitter and capacity factor eval for MOE (#434) HELSON 2022-03-16 16:47:44 +0800
  • c3ee600350 polish code jiaruifang 2022-03-16 16:30:32 +0800
  • 05990b6f55 added Multiply Jitter and capacity factor eval for MOE 1SAA 2022-03-16 16:24:07 +0800
  • b03b3ae99c fixed mem monitor device (#433) Frank Lee 2022-03-16 15:25:02 +0800
  • d04f16753f fixed mem monitor device FrankLeeeee 2022-03-16 15:10:55 +0800
  • 14a7094243 fixed fp16 optimizer none grad bug (#432) Frank Lee 2022-03-16 14:35:46 +0800
  • 174fa52980 fixed fp16 optimizer none grad bug FrankLeeeee 2022-03-16 14:31:58 +0800
  • fa5d101b72 polsih code jiaruifang 2022-03-16 14:28:38 +0800
  • fce9432f08 sync before creating empty grad ver217 2022-03-16 13:40:19 +0800
  • ea6905a898 free param.grad ver217 2022-03-15 19:04:36 +0800
  • 9506a8beb2 use double buffer to handle grad ver217 2022-03-15 17:07:35 +0800
  • 0f5f5dd556 fixed gpt attention mask in pipeline (#430) Frank Lee 2022-03-16 14:23:43 +0800
  • 0ff19a2dac [WIP] initialize for new zero jiaruifang 2022-03-16 14:21:46 +0800
  • 13d82ceed9 fixed gpt attention mask in pipeline FrankLeeeee 2022-03-16 14:20:13 +0800
  • 41a4b818fd sync before creating empty grad ver217 2022-03-16 13:40:19 +0800
  • f9c762df85 [test] merge zero optim tests (#428) Jiarui Fang 2022-03-16 12:22:45 +0800