1001 Commits (e37f3db40c4abce67e42a5cb2493e3fce0e08de3)

Author SHA1 Message Date
HELSON e37f3db40c
[gemini] add arguments (#2046) 2 years ago
Zihao 6a9158f1fa
[Gemini] free and allocate cuda memory by tensor.storage, add grad hook (#2040) 2 years ago
Jiarui Fang 31c644027b
[hotfix] hotfix Gemini for no leaf modules bug (#2043) 2 years ago
HELSON a1ce02d740
[zero] test gradient accumulation (#1964) 2 years ago
Ziyue Jiang b0936e4a44
[rpc] split with dag (#2028) 2 years ago
Jiarui Fang 96134e7be3
[hotfix] add bert test for gemini fwd bwd (#2035) 2 years ago
YuliangLiu0306 0dbcd4a6f5
[autoparallel] add split handler (#2032) 2 years ago
Jiarui Fang 28aa9a4294
[Gemini] more rigorous unit tests for run_fwd_bwd (#2034) 2 years ago
YuliangLiu0306 81330b0352
[autoparallel] add experimental permute handler (#2029) 2 years ago
Zihao 95c4532fff
[Gemini] paramWrapper paramTracerHook unitest (#2030) 2 years ago
Jiarui Fang 8daf1b4db1
[Gemini] patch for supporting orch.add_ function for ColoTensor (#2003) 2 years ago
Ziyue Jiang 632753abbc
[fx]Split partition with DAG information (#2025) 2 years ago
YuliangLiu0306 ea0f6b8df9
[autoparallel] add runtime pass and numerical test for view handler (#2018) 2 years ago
Zihao a719b89a41
[gemini] param_trace_hook (#2020) 2 years ago
Jiarui Fang 0b0d8f9e17
[hotfix] revert bug PRs (#2016) 2 years ago
Zihao aba3db464d
[Gemini] ParamMemHook (#2008) 2 years ago
Zihao 0160a62a3c
[Gemini] param_tracer_wrapper and test case (#2009) 2 years ago
YuliangLiu0306 1438993113
[autoparallel] add experimental view handler (#2011) 2 years ago
Genghan Zhang d655eea515
[autoparallel] mix gather (#1977) 2 years ago
Frank Lee 2bab6f512c
[release] release v0.1.11rc4 (#2007) 2 years ago
Boyuan Yao 6cd784ffee
[autoparallel] Add metainfo support for F.linear (#1987) 2 years ago
Super Daniel 2edbef13cc
[fx] add more meta_registry for MetaTensor execution. (#2000) 2 years ago
Jiarui Fang a2d3266648
[hotfix] make Gemini work for conv DNN (#1998) 2 years ago
YuliangLiu0306 155891113e
[autoparallel] use pytree map style to process data (#1989) 2 years ago
YuliangLiu0306 35e6b9ec82
[autoparallel] adapt handlers with attention block (#1990) 2 years ago
YuliangLiu0306 05020e50d0
[autoparallel] support more flexible data type (#1967) 2 years ago
Boyuan Yao c26f21d365
[autoparallel] add pooling metainfo (#1968) 2 years ago
Jiarui Fang 3712ac7f90
[Gemini] add bert for MemtracerWrapper unintests (#1982) 2 years ago
Jiarui Fang e481489aa6
[Gemini] MemtracerWrapper unittests (#1981) 2 years ago
Jiarui Fang 31922110ad
[Gemini] memory trace hook (#1978) 2 years ago
Jiarui Fang 0529fcde06
[Gemini] independent runtime tracer (#1974) 2 years ago
YuliangLiu0306 0da1d00399
[autoparallel] support distributed dataloader option (#1906) 2 years ago
Genghan Zhang 6630d45546
[autoparallel] Add alpha beta (#1973) 2 years ago
Jiarui Fang cc0ed7cf33
[Gemini] ZeROHookV2 -> GeminiZeROHook (#1972) 2 years ago
ver217 f8a7148dec
[kernel] move all symlinks of kernel to `colossalai._C` (#1971) 2 years ago
Jiarui Fang 7e24b9b9ee
[Gemini] clean no used MemTraceOp (#1970) 2 years ago
Boyuan Yao 7c7921f71b
[autoparallel] add torch.nn.ReLU metainfo (#1868) 2 years ago
Jiarui Fang 8c66a1d0aa
[polish] remove useless file _mem_tracer_hook.py (#1963) 2 years ago
Jiarui Fang c4739a725a
[Gemini] polish memstats collector (#1962) 2 years ago
YuliangLiu0306 fea3cb661c
[autoparallel] support addmm in tracer and solver (#1961) 2 years ago
Jiarui Fang f7e276fa71
[Gemini] add GeminiAdamOptimizer (#1960) 2 years ago
HELSON 7066dfbf82
[zero] fix memory leak for zero2 (#1955) 2 years ago
Jiarui Fang 52c6ad26e0
[ColoTensor] reconfig ColoInitContext, decouple default_pg and default_dist_spec. (#1953) 2 years ago
zbian 598d456d0e
fixed logger 2 years ago
zbian 6877121377
updated flash attention api 2 years ago
YuliangLiu0306 36c0f3ea5b
[autoparallel] remove redundancy comm node (#1893) 2 years ago
アマデウス e52f9d9109
[tensorparallel] fixed tp layers (#1938) 2 years ago
Jiarui Fang 9f4fb3f28a
[ColoTensor] ColoInitContext initialize parameters in shard mode. (#1937) 2 years ago
Boyuan Yao d5c5bc219e
[SC] add GPT example for auto checkpoint (#1889) 2 years ago
Junming Wu 14a0b18305
[NFC] polish colossalai/amp/naive_amp/__init__.py code style (#1905) 2 years ago