Commit Graph

1609 Commits (eedaa3e1ef991d9f9a274d10c046877ba2b10467)

Author SHA1 Message Date
Hongxin Liu 172f7fa3cf [misc] resolve code factor issues (#4433)
1 year ago
flybird11111 108e54a0b4 [shardformer]update t5 tests for using all optimizations. (#4407)
1 year ago
flybird11111 1edc9b5fb3 [shardformer] update tests for all optimization (#4413)
1 year ago
Baizhou Zhang 7711bd524a [shardformer] rewrite tests for opt/bloom/llama/vit/chatglm (#4395)
1 year ago
flybird1111 d2cd48e0be [shardformer] test all optimizations (#4399)
1 year ago
flybird1111 7a3dfd0c64 [shardformer] update shardformer to use flash attention 2 (#4392)
1 year ago
Baizhou Zhang ed4c448488 [pipeline] rewrite t5 tests & support multi-tensor transmitting in pipeline (#4388)
1 year ago
flybird1111 906426cb44 [Shardformer] Merge flash attention branch to pipeline branch (#4362)
1 year ago
Jianghai a88e92251d [pipeline] add chatglm (#4363)
1 year ago
Baizhou Zhang b1feeced8e [shardformer] add util functions for shardformer tests/fix sync_shared_param (#4366)
1 year ago
FoolPlayer 726541afe2 update some module with new api version
1 year ago
FoolPlayer 879301d0da [shardformer] support Blip2 (#4243)
1 year ago
klhhhhh 8120eca0c0 [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit
1 year ago
klhhhhh 91850fe984 [shardformer] register without auto policy
1 year ago
klhhhhh 1a29e8fc29 [shardformer] polish chatglm code
1 year ago
klhhhhh 8620009dd7 [sharformer] add first version of policy of chatglm
1 year ago
Kun Lin ed34bb1310 Feature/chatglm (#4240)
1 year ago
FoolPlayer 9ee4ebea83 [shardformer] support whisper (#4212)
1 year ago
FoolPlayer dd2bf02679 [shardformer] support SAM (#4231)
1 year ago
Kun Lin c59d7aca09 Feature/vit support (#4182)
1 year ago
Baizhou Zhang 0ceec8f9a9 [pipeline] support fp32 for HybridPlugin/merge shardformer test and pipeline test into one file (#4354)
1 year ago
Jianghai f13954cd58 [pipeline] refactor test pipeline and remove useless utils in pipeline (#4324)
1 year ago
Baizhou Zhang da3cef27ad [pipeline] fix return_dict/fix pure_pipeline_test (#4331)
1 year ago
Hongxin Liu 261eab02fb [plugin] add 3d parallel plugin (#4295)
1 year ago
FoolPlayer b3f5d7a3ba [shardformer] support pipeline base vit model (#4284)
1 year ago
Baizhou Zhang 083d7da33d [pipeline] add pipeline support for all T5 models (#4310)
1 year ago
Jianghai d0807122e2 [pipeline] test pure pipeline process using llama (#4218)
1 year ago
Baizhou Zhang 36e546b2cc [pipeline] add pipeline support for T5Stack/T5EncoderModel (#4300)
1 year ago
Jianghai 18ebcf406a [pipeline] reformat for unified design (#4283)
1 year ago
Jianghai 0a8f3c851a [hotfix] fix opt pipeline (#4293)
1 year ago
Jianghai d8408d185c [pipeline] OPT model pipeline (#4258)
1 year ago
Baizhou Zhang b774d5ea0f [pipeline] refactor gpt2 pipeline forwards (#4287)
1 year ago
Hongxin Liu d921ce8391 [shardformer] support inplace sharding (#4251)
1 year ago
Baizhou Zhang 2a2eacfaf1 [pipeline] support shardformer for GPT2ForQuestionAnswering & complete pipeline support for GPT2 (#4245)
1 year ago
Jianghai 34f0e34a4c [pipeline] finish bloom models pipeline and tests (#4223)
1 year ago
Jianghai e7cc62d735 [pipeline] All bert models (#4233)
1 year ago
Baizhou Zhang a14d352088 [pipeline] add pipeline forward for variants of gpt2 (#4238)
1 year ago
Hongxin Liu 7e4de520e1 [shardformer] fix base policy (#4229)
1 year ago
Baizhou Zhang 208ac8f2ba [pipeline] Add Pipeline Forward for GPT2Model Shardformer (#4224)
1 year ago
Jianghai 37d22f6878 [pipeline] add bloom model pipeline (#4210)
1 year ago
Jianghai 31bcf867ae [pipeline] Llama causal lm and llama for sequence classification pipeline (#4208)
1 year ago
Jianghai 1622031058 [pipeline] Llama pipeline (#4205)
1 year ago
Jianghai 1094e0f0d3 [pipeline] Bert pipeline for shardformer and its tests (#4197)
1 year ago
Hongxin Liu 890774b2fb [shardformer] support lazy init (#4202)
1 year ago
Jianghai f3bcc292c8 [pipeline] move bert related pipeline components to shardformer (#4187)
1 year ago
Jianghai c5ea728016 [pipeline] add bert_for_pretraining bert_lmhead forward and policy (#4172)
1 year ago
ver217 d35bd7d0e6 [shardformer] fix type hint
1 year ago
ver217 1ed3f8a24f [shardformer] rename policy file name
1 year ago
ver217 b0b8ad2823 [pipeline] update shardformer docstring
1 year ago
ver217 59f6f573f1 [pipeline] update shardformer policy
1 year ago
Jianghai 90a65ea682 [pipeline] build bloom model and policy , revise the base class of policy (#4161)
1 year ago
Jianghai e8e7e49243 [pipeline]add pipeline policy and bert forward (#4130)
1 year ago
Hongxin Liu f51ce1bc8e [pipeline] refactor 1f1b schedule (#4115)
1 year ago
Hongxin Liu 45fdc9b42c [pipeline] implement p2p communication (#4100)
1 year ago
Hongxin Liu 422544222f [pipeline] add stage manager (#4093)
1 year ago
Hongxin Liu 5e1a9d48dd [cluster] add process group mesh (#4039)
1 year ago
LuGY d86ddd9b29 [hotfix] fix unsafe async comm in zero (#4404)
1 year ago
Baizhou Zhang 6ccecc0c69 [gemini] fix tensor storage cleaning in state dict collection (#4396)
1 year ago
binmakeswell 089c365fa0 [doc] add Series A Funding and NeurIPS news (#4377)
1 year ago
flybird1111 38b792aab2 [coloattention] fix import error (#4380)
1 year ago
flybird1111 25c57b9fb4 [fix] coloattention support flash attention 2 (#4347)
1 year ago
Hongxin Liu 16bf4c0221 [test] remove useless tests (#4359)
1 year ago
LuGY 03654c0ce2 fix localhost measurement (#4320)
1 year ago
LuGY 45b08f08cb [zero] optimize the optimizer step time (#4221)
1 year ago
LuGY 1a49a5ea00 [zero] support shard optimizer state dict of zero (#4194)
1 year ago
LuGY dd7cc58299 [zero] add state dict for low level zero (#4179)
1 year ago
LuGY c668801d36 [zero] allow passing process group to zero12 (#4153)
1 year ago
LuGY 79cf1b5f33 [zero]support no_sync method for zero1 plugin (#4138)
1 year ago
LuGY c6ab96983a [zero] refactor low level zero for shard evenly (#4030)
1 year ago
dayellow a50d39a143 [NFC] fix: format (#4270)
1 year ago
Wenhao Chen fee553288b [NFC] polish runtime_preparation_pass style (#4266)
1 year ago
YeAnbang 3883db452c [NFC] polish unary_elementwise_generator.py code style (#4267)
1 year ago
梁爽 abe4f971e0 [NFC] polish colossalai/booster/plugin/low_level_zero_plugin.py code style (#4256)
1 year ago
Yanjia0 c614a99d28 [NFC] polish colossalai/auto_parallel/offload/amp_optimizer.py code style (#4255)
1 year ago
ocd_with_naming 85774f0c1f [NFC] polish colossalai/cli/benchmark/utils.py code style (#4254)
1 year ago
Michelle 86cf6aed5b Fix/format (#4261)
1 year ago
Jianghai b366f1d99f [NFC] Fix format for mixed precision (#4253)
1 year ago
Baizhou Zhang c6f6005990 [checkpointio] Sharded Optimizer Checkpoint for Gemini Plugin (#4302)
1 year ago
Hongxin Liu fc5cef2c79 [lazy] support init on cuda (#4269)
1 year ago
Cuiqing Li 4b977541a8 [Kernels] added triton-implemented of self attention for colossal-ai (#4241)
1 year ago
Jianghai 9a4842c571 revise shardformer readme (#4246)
1 year ago
Baizhou Zhang 58913441a1 Next commit [checkpointio] Unsharded Optimizer Checkpoint for Gemini Plugin (#4141)
1 year ago
Frank Lee 190a6ea9c2 [dtensor] fixed readme file name and removed deprecated file (#4162)
1 year ago
Hongxin Liu 1908caad38 [cli] hotfix launch command for multi-nodes (#4165)
1 year ago
digger yu 2ac24040eb fix some typo colossalai/shardformer (#4160)
1 year ago
github-actions[bot] c77b3b19be [format] applied code formatting on changed files in pull request 4152 (#4157)
1 year ago
Frank Lee 89f45eda5a [shardformer] added development protocol for standardization (#4149)
1 year ago
Frank Lee 1fb0d95df0 [shardformer] made tensor parallelism configurable (#4144)
1 year ago
Frank Lee 74257cb446 [shardformer] refactored some doc and api (#4137)
1 year ago
jiangmingyan 7f9b30335b [shardformer] write an shardformer example with bert finetuning (#4126)
1 year ago
Frank Lee ae035d305d [shardformer] added embedding gradient check (#4124)
1 year ago
Frank Lee 44a190e6ac [shardformer] import huggingface implicitly (#4101)
1 year ago
Frank Lee 6a88bae4ec [shardformer] integrate with data parallelism (#4103)
1 year ago
Frank Lee f3b6aaa6b7 [shardformer] supported fused normalization (#4112)
1 year ago
Frank Lee b1c2901530 [shardformer] supported bloom model (#4098)
1 year ago
Kun Lin 8af29ee47a [shardformer] support vision transformer (#4096)
1 year ago
jiangmingyan ac80937138 [shardformer] shardformer support opt models (#4091)
1 year ago
Frank Lee d33a44e8c3 [shardformer] refactored layernorm (#4086)
1 year ago
Frank Lee c4b1b65931 [test] fixed tests failed due to dtensor change (#4082)
1 year ago
FoolPlayer 92f6791095 [shardformer] Add layernorm (#4072)
1 year ago