Commit Graph

  • 91c327cb44 fixed zero level 3 dtype bug (#76) Frank Lee 2021-12-20 17:00:53 +0800
  • e23d683a63 add interleaved pipeline, fix naive amp and update pipeline model initializer ver217 2021-12-14 16:44:59 +0800
  • 214f0317ee Updated 2.5d code BoxiangW 2021-12-20 12:23:16 +0800
  • c0c2f2cec3 ColossalAI Example MNIST Image Classifier Ambrish Kashyap 2021-12-20 08:12:39 +0530
  • d6e0b5424b Merge branch 'feature/pipeline' into feature/pipeline ver217 2021-12-17 13:07:18 +0800
  • c74620de86 fix interleaved pipeline naive amp ver217 2021-12-17 10:17:43 +0800
  • c3feecc327 update pipeline model initializer ver217 2021-12-16 17:15:32 +0800
  • 491eff5831 Add files via upload ShanMait 2021-12-16 23:12:55 +0530
  • 4ba5ef5437 Add files via upload ShanMait 2021-12-16 23:10:32 +0530
  • eaf2517c1e detail README & standardize dataset path xinzhang 2021-12-16 13:29:58 +0000
  • 0079d0ccae Merge branch 'hpcaitech:main' into taskv2-branch Xin Zhang 2021-12-16 21:25:32 +0800
  • 32cd8c9572 update pipeline model initializer (#77) ver217 2021-12-16 17:33:30 +0800
  • 1a992c8a5b update pipeline model initializer ver217 2021-12-16 17:15:32 +0800
  • 632e622de8 overlap computation and communication in 2d operations (#75) HELSON 2021-12-16 16:05:15 +0800
  • 005e192ac1 fixed zero level 3 dtype bug FrankLeeeee 2021-12-16 15:56:39 +0800
  • 72f7c4af95 overlap computation and communication in 2d operations 1SAA 2021-12-16 15:54:13 +0800
  • 4951942f4c add interleaved pipeline and fix naive amp ver217 2021-12-14 16:44:59 +0800
  • 6f8c18d28f overlap computation and communication in 2d operations 1SAA 2021-12-16 11:09:48 +0800
  • 6140e49e6c added CI for unit testing (#69) Frank Lee 2021-12-16 10:32:08 +0800
  • 1cdbc33e28 Update issue templates (#66) Frank Lee 2021-12-14 12:01:46 +0800
  • 9e6374e8b7 update examples and sphnix docs for the new api (#63) Frank Lee 2021-12-13 22:07:01 +0800
  • e073702c6e fix zero3 fp16 and add zero3 model context (#62) ver217 2021-12-10 17:48:50 +0800
  • c0421700f1 update markdown docs (english) (#60) Frank Lee 2021-12-10 14:37:33 +0800
  • def1c6e631 Develop/experiments (#59) Frank Lee 2021-12-09 15:08:29 +0800
  • 028b789370 add how to build tfrecord dataset (#48) ver217 2021-12-02 16:31:23 +0800
  • ff3d3b8c49 add some details in vit-b16 example (#46) ver217 2021-12-02 09:29:27 +0800
  • 2be4a6b8f9 add some details in vit-b16 example (#43) (#44) ver217 2021-12-02 08:55:11 +0800
  • fab4c862f0 add explanation for ViT example (#35) (#36) binmakeswell 2021-11-29 10:25:38 +0800
  • a845123aab add an example of ViT-B/16 and remove w_norm clipping in LAMB (#29) ver217 2021-11-18 23:45:09 +0800
  • 7089b5f5bc Support TP-compatible Torch AMP and Update trainer API (#27) Frank Lee 2021-11-18 19:45:06 +0800
  • 7bfe2548f0 use env to control the language of doc (#24) (#25) ver217 2021-11-15 16:53:56 +0800
  • 67c0b15ff5 remove redundancy func in setup (#19) (#20) ver217 2021-11-15 16:43:28 +0800
  • 8dfef9b6e1 fixed some typos in the documents, added blog link and paper author information in README binmakeswell 2021-11-03 16:07:28 +0800
  • 92dea676c2 added Chinese documents and fixed some typos in English documents Fan Cui 2021-11-02 23:01:13 +0800
  • 5b30975cd7 overlap computation and communication in 2d operations 1SAA 2021-12-16 11:09:48 +0800
  • cd9c28e055 added CI for unit testing (#69) Frank Lee 2021-12-16 10:32:08 +0800
  • 52f8cb1d2c integrating model zbian 2021-11-03 05:30:59 +0100
  • f9de7fe7ab sync to latest code writing style and modify README xinzhang 2021-12-15 10:41:21 +0000
  • fe79fcf92f sync to latest code writing style xinzhang 2021-12-15 10:37:31 +0000
  • 9165aa8ebd added CI for unit testing FrankLeeeee 2021-12-14 05:12:24 +0100
  • cb86b98489 add interleaved pipeline ver217 2021-12-14 16:44:59 +0800
  • 1a9192f8d6 Overlap computation and communication in 2d operation 1SAA 2021-12-15 11:53:18 +0800
  • 1ade53dada Merge branch 'hpcaitech:main' into taskv2-branch Xin Zhang 2021-12-15 10:12:04 +0800
  • d7ace4039f add feat: gpt model (1d and vanilla) WANG-CR 2021-12-14 12:04:39 +0800
  • 45355a62f7 Update issue templates (#66) Frank Lee 2021-12-14 12:01:46 +0800
  • 4b421aaced Update issue templates Frank Lee 2021-12-14 11:14:04 +0800
  • ddccdc488d add activation checkpoint offload ver217 2021-12-13 15:33:30 +0800
  • 35813ed3c4 update examples and sphnix docs for the new api (#63) Frank Lee 2021-12-13 22:07:01 +0800
  • 0b22e5423f update examples and sphnix docs for the new api Frank Lee 2021-12-10 15:33:19 +0800
  • d7876f82f5 Merge branch 'hpcaitech:main' into taskv2-branch Xin Zhang 2021-12-11 23:44:11 +0800
  • 9d6c58876d updated xinzhang 2021-12-11 15:12:29 +0000
  • aef05d5f00 Merge branch 'main' of https://github.com/ExtremeViscent/ColossalAI extremeviscent 2021-12-11 07:57:37 +0800
  • 46180562b4 Merge branch 'main' of https://github.com/hpcaitech/ColossalAI extremeviscent 2021-12-11 07:56:10 +0800
  • a23eabca80 Merge branch 'hpcaitech:main' into main ExtremeViscent 2021-12-10 19:47:31 +0000
  • b7dc7a7677 Deleted irrelavant folder extremeviscent 2021-12-11 03:46:33 +0800
  • 8ee33b1ce8 MPI-backended Segmentation Example extremeviscent 2021-12-11 03:44:56 +0800
  • 7d3711058f fix zero3 fp16 and add zero3 model context (#62) ver217 2021-12-10 17:48:50 +0800
  • 016235789b fix zero3 fp16 and add zero3 model context ver217 2021-12-09 20:16:23 +0800
  • 157a4c569b fix zero3 fp16 and add zero3 model context ver217 2021-12-09 20:16:23 +0800
  • 9a0466534c update markdown docs (english) (#60) Frank Lee 2021-12-10 14:37:33 +0800
  • 993088d45e Added 2 examples rahulgupta9202 2021-12-10 11:54:48 +0530
  • cb79caaa63 update markdown docs (english) FrankLeeeee 2021-12-10 07:17:30 +0100
  • ffc01a3f43 Delete CIFAR10_image_classifier_using_ResNet50.ipynb rahulgupta9202 2021-12-10 11:34:25 +0530
  • a82a514622 Delete DogsVcat_inceptionCNN(transfer_learning).ipynb rahulgupta9202 2021-12-10 11:34:15 +0530
  • da01c234e1 Develop/experiments (#59) Frank Lee 2021-12-09 15:08:29 +0800
  • e731494b94 update api for better usability (#58) Frank Lee 2021-12-09 12:09:41 +0800
  • 7701238f69 Update 2.5d layer code to get a similar accuracy on imagenet-1k dataset BoxiangW 2021-12-06 10:17:21 +0800
  • 893e94ef24 optimized 3d layer to fix slow computation ; tested imagenet performance with 3d; reworked lr_scheduler config definition; fixed launch args; fixed some printing issues; simplified apis of 3d layers (#51) アマデウス 2021-12-04 16:45:01 +0800
  • c9684a14c4 Feature/pipeline (#40) ver217 2021-12-04 15:48:47 +0800
  • 944395813e Feature/ddp (#49) ver217 2021-12-04 15:47:13 +0800
  • d20089db35 fixed 1D and 2D convergence (#38) Frank Lee 2021-12-04 15:44:53 +0800
  • f4dd9f135d Integrate 1d tensor parallel in Colossal-AI (#39) puck_WCR 2021-11-29 14:00:14 +0800
  • 8ee9f5ed79 Split conv2d, class token, positional embedding in 2d, Fix random number in ddp Fix convergence in cifar10, Imagenet1000 1SAA 2021-11-18 17:54:19 +0800
  • 4b032d1ec4 improved consistency between trainer, engine and schedule (#23) Frank Lee 2021-11-15 17:19:41 +0800
  • 71e019ecca Revert "fixed trainer" Frank Lee 2021-11-11 18:07:46 +0800
  • e327a27ea6 fixed trainer Frank Lee 2021-11-11 17:41:45 +0800
  • 70f69be293 fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes Frank Lee 2021-11-10 10:47:58 +0800
  • 74ad4380da fix FP16 optimizer and adapted torch amp with tensor parallel (#18) ver217 2021-11-08 16:47:32 +0800
  • 5ac4a94d67 Add gradient accumulation, fix lr scheduler 1SAA 2021-11-08 15:48:27 +0800
  • 073b2fbc1f update api for better usability FrankLeeeee 2021-12-08 03:52:42 +0100
  • 07f3e980f6 fixed 1D ViT convergence problem FrankLeeeee 2021-12-03 16:53:25 +0100
  • f5fdf496ce Add files via upload rahulgupta9202 2021-12-08 05:54:35 +0530
  • 45eb55c650 Merge pull request #54 from BoxiangW/develop/experiments BoxiangW 2021-12-07 21:18:25 +0800
  • 6c34f7f432 Update 2.5d layer code to get a similar accuracy on imagenet-1k dataset BoxiangW 2021-12-06 10:17:21 +0800
  • f962568a31 Merge branch 'hpcaitech:main' into main BoxiangW 2021-12-05 14:42:47 +0800
  • 7660a2bf1c Update HSI_processing_colossal_example.ipynb hnqin-xdu 2021-12-05 08:25:04 +0800
  • bdb19cb04c Add files via upload hnqin-xdu 2021-12-05 07:57:29 +0800
  • ff398efa68 Delete HSI_processing_colossal_example.ipynb hnqin-xdu 2021-12-05 07:57:02 +0800
  • 4faf094dad An example for hyperspectral image processing. hnqin-xdu 2021-12-04 18:25:42 +0800
  • d2f5a7d8df optimized 3d layer to fix slow computation ; tested imagenet performance with 3d; reworked lr_scheduler config definition; fixed launch args; fixed some printing issues; simplified apis of 3d layers (#51) アマデウス 2021-12-04 16:45:01 +0800
  • 72e7f5b51c Merge branch 'develop/experiments' into 3d-imagenet アマデウス 2021-12-04 15:56:46 +0800
  • 9e72bc15a0 Feature/pipeline (#40) ver217 2021-12-04 15:48:47 +0800
  • 390a1eee48 Feature/ddp (#49) ver217 2021-12-04 15:47:13 +0800
  • 83d73ae8dc fixed 1D and 2D convergence (#38) Frank Lee 2021-12-04 15:44:53 +0800
  • cff008048a optimized 3d layer to fix slow computation ; tested imagenet performance with 3d; reworked lr_scheduler config definition; fixed launch args; fixed some printing issues; simplified apis of 3d layers zbian 2021-11-19 05:18:12 +0100
  • e29e3a83e5 fixed 1D ViT convergence problem FrankLeeeee 2021-12-03 16:53:25 +0100
  • e556694609 modify timing hook ver217 2021-12-03 18:56:23 +0800
  • e64fca3665 simclr v2, replace nvidia dali dataloader xinzhang 2021-12-02 11:31:43 +0000
  • ad74e8c037 fix grad clip for pipeline ver217 2021-12-02 17:20:58 +0800
  • 515267aa21 Merge branch 'develop/experiments' into feature/ddp ver217 2021-12-02 16:40:12 +0800