Commit Graph

182 Commits (ed4c4484880b733894e6088e681f7cca32afe0b4)

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Baizhou Zhang | 0ceec8f9a9 | [pipeline] support fp32 for HybridPlugin/merge shardformer test and pipeline test into one file (#4354) | 1 year ago |
| Hongxin Liu | d921ce8391 | [shardformer] support inplace sharding (#4251) | 1 year ago |
| Frank Lee | 190a6ea9c2 | [dtensor] fixed readme file name and removed deprecated file (#4162) | 1 year ago |
| Frank Lee | c4b1b65931 | [test] fixed tests failed due to dtensor change (#4082) | 1 year ago |
| Frank Lee | 70c58cfd4f | [shardformer] supported fused qkv checkpoint (#4073) | 1 year ago |
| Frank Lee | 8eb09a4c69 | [shardformer] support module saving and loading (#4062) | 1 year ago |
| Frank Lee | 45d9384346 | [shardformer] removed inplace tensor sharding (#4018) | 1 year ago |
| Frank Lee | 015af592f8 | [shardformer] integrated linear 1D with dtensor (#3996) | 1 year ago |
| FoolPlayer | a2f9af810d | [shardformer] fix an error in readme (#3988) | 1 year ago |
| Frank Lee | ddcf58cacf | Revert "[sync] sync feature/shardformer with develop" | 1 year ago |
| Frank Lee | eb39154d40 | [dtensor] updated api and doc (#3845) | 1 year ago |
| Frank Lee | d51e83d642 | Merge pull request #3916 from FrankLeeeee/sync/dtensor-with-develop | 2 years ago |
| digger yu | 0e484e6201 | [nfc]fix typo colossalai/pipeline tensor nn (#3899) | 2 years ago |
| Hongxin Liu | 7c9f2ed6dd | [dtensor] polish sharding spec docstring (#3838) | 2 years ago |
| YH | 2629f9717d | [tensor] Refactor handle_trans_spec in DistSpecManager | 2 years ago |
| digger-yu | b9a8dff7e5 | [doc] Fix typo under colossalai and doc(#3618) | 2 years ago |
| YH | 8f740deb53 | Fix typo (#3448) | 2 years ago |
| YH | 1a229045af | Add interface for colo tesnor dp size (#3227) | 2 years ago |
| YuliangLiu0306 | 258b43317c | [hotfix] layout converting issue (#3188) | 2 years ago |
| YuliangLiu0306 | 2eca4cd376 | [DTensor] refactor dtensor with new components (#3089) | 2 years ago |
| YuliangLiu0306 | 8e4e8601b7 | [DTensor] implement layout converter (#3055) | 2 years ago |
| YuliangLiu0306 | 29386a54e6 | [DTensor] refactor CommSpec (#3034) | 2 years ago |
| YuliangLiu0306 | cd2b0eaa8d | [DTensor] refactor sharding spec (#2987) | 2 years ago |
| YuliangLiu0306 | e414e4092b | [DTensor] implementation of dtensor (#2946) | 2 years ago |
| YuliangLiu0306 | 47fb214b3b | [hotfix] add shard dim to aviod backward communication error (#2954) | 2 years ago |
| Jiatong (Julius) Han | 8c8a39be95 | [hotfix]: Remove math.prod dependency (#2837) | 2 years ago |
| HELSON | 552183bb74 | [polish] polish ColoTensor and its submodules (#2537) | 2 years ago |
| YuliangLiu0306 | aa0f6686f9 | [autoparallel] accelerate gpt2 training (#2495) | 2 years ago |
| HELSON | 707b11d4a0 | [gemini] update ddp strict mode (#2518) | 2 years ago |
| Jiarui Fang | 8f72b6f8fb | [hotfix] fix implement error in diffusers | 2 years ago |
| 1SAA | 33f3023e19 | [hotfix] fix implement error in diffusers | 2 years ago |
| Jiarui Fang | 1aaeb596c6 | [example] gpt, shard init on all processes (#2366) | 2 years ago |
| Boyuan Yao | 22e947f982 | [autoparallel] fix runtime apply memory estimation (#2281) | 2 years ago |
| xcnick | 85178a397a | [hotfix] fix error for torch 2.0 (#2243) | 2 years ago |
| Boyuan Yao | 24246f7aa5 | [autoparallel] Attach input, buffer and output tensor to MetaInfo class (#2162) | 2 years ago |
| HELSON | 2458659919 | [zero] fix error for BEiT models (#2169) | 2 years ago |
| Boyuan Yao | cfe2a9bd90 | [autoparallel] memory estimation for shape consistency (#2144) | 2 years ago |
| Jiarui Fang | 2827f41898 | [Gemini] GeminiDPP convert to PyTorch Module. (#2151) | 2 years ago |
| Jiarui Fang | e99edfcb51 | [NFC] polish comments for Chunk class (#2116) | 2 years ago |
| Jiarui Fang | b3b89865e2 | [Gemini] ParamOpHook -> ColoParamOpHook (#2080) | 2 years ago |
| YuliangLiu0306 | 81330b0352 | [autoparallel] add experimental permute handler (#2029) | 2 years ago |
| Genghan Zhang | d655eea515 | [autoparallel] mix gather (#1977) | 2 years ago |
| YuliangLiu0306 | 36c0f3ea5b | [autoparallel] remove redundancy comm node (#1893) | 2 years ago |
| YuliangLiu0306 | 49216d7ab1 | [autoparallel] fix bugs caused by negative dim key (#1808) | 2 years ago |
| Jiarui Fang | 218c75fd9d | [NFC] polish type hint for shape consistency (#1801) | 2 years ago |
| HELSON | c6a1a62636 | [hotfix] fix zero's incompatibility with checkpoint in torch-1.12 (#1786) | 2 years ago |
| Frank Lee | f3f19a5c47 | [autoparallel] added matmul handler (#1763) | 2 years ago |
| YuliangLiu0306 | b0f7c8bde8 | [autoparallel] update CommSpec to CommActions (#1768) | 2 years ago |
| YuliangLiu0306 | b4cc59b61e | [autoparallel] add numerical test for node strategies (#1760) | 2 years ago |
| YuliangLiu0306 | 980ed21723 | [autoparallel] shard param and buffer as expected (#1753) | 2 years ago |