Commit Graph

22 Commits (b0b8ad28237707deae64ac92e98aad17ac76e1b4)

Author SHA1 Message Date
Frank Lee 1fb0d95df0 [shardformer] made tensor parallelism configurable (#4144)
1 year ago
Frank Lee 74257cb446 [shardformer] refactored some doc and api (#4137)
1 year ago
Frank Lee ae035d305d [shardformer] added embedding gradient check (#4124)
1 year ago
Frank Lee 6a88bae4ec [shardformer] integrate with data parallelism (#4103)
1 year ago
Frank Lee f3b6aaa6b7 [shardformer] supported fused normalization (#4112)
1 year ago
Frank Lee b1c2901530 [shardformer] supported bloom model (#4098)
1 year ago
Kun Lin 8af29ee47a [shardformer] support vision transformer (#4096)
1 year ago
jiangmingyan ac80937138 [shardformer] shardformer support opt models (#4091)
1 year ago
FoolPlayer 92f6791095 [shardformer] Add layernorm (#4072)
1 year ago
Frank Lee 70c58cfd4f [shardformer] supported fused qkv checkpoint (#4073)
1 year ago
FoolPlayer 0803a61412 [shardformer] add linearconv1d test (#4067)
1 year ago
FoolPlayer 7740c55c55 support kit use for bert/gpt test (#4055)
1 year ago
Frank Lee 58df720570 [shardformer] adapted T5 and LLaMa test to use kit (#4049)
1 year ago
FoolPlayer 4021b9a8a2 [shardformer] add gpt2 test and layer class refactor (#4041)
1 year ago
Frank Lee d857f3dbba [shardformer] supported T5 and its variants (#4045)
1 year ago
Frank Lee c1d5453e9f [shardformer] adapted llama to the new API (#4036)
1 year ago
FoolPlayer 74d176c8d8 [shardformer] fix bert and gpt downstream with new api (#4024)
1 year ago
FoolPlayer dfca9678fa integrate with dist layer (#4011)
1 year ago
FoolPlayer f7774ec0f3 [Shardformer] Downstream bert (#3979)
1 year ago
wukong1992 c1c672d0f0 [shardformer] shardformer support t5 model (#3994)
1 year ago
wukong1992 6b30dfb7ce [shardformer] support llama model using shardformer (#3969)
1 year ago
FoolPlayer f1cb5ac6bf [shardformer] Align bert value (#3907)
1 year ago