HELSON
e37f3db40c
[gemini] add arguments ( #2046 )
* [zero] fix testing parameters
* [gemini] add arguments
* add docstrings
2 years ago
Zihao
6a9158f1fa
[Gemini] free and allocate cuda memory by tensor.storage, add grad hook ( #2040 )
2 years ago
Jiarui Fang
1e885329f4
[test] align model name with the file name. ( #2045 )
2 years ago
Jiarui Fang
31c644027b
[hotfix] hotfix Gemini for no leaf modules bug ( #2043 )
2 years ago
HELSON
384cd26314
[zero] fix testing parameters ( #2042 )
2 years ago
HELSON
17a3c685b0
[zero] fix unit-tests ( #2039 )
2 years ago
Jiarui Fang
eb7742a4bb
[Gemini] more tests for Gemini ( #2038 )
* [Gemini] more tests for Gemini
* polish code
2 years ago
HELSON
537e181705
[testing] fix testing models ( #2036 )
* [testing] fix testing models
* roll back
2 years ago
HELSON
a1ce02d740
[zero] test gradient accumulation ( #1964 )
* [zero] fix memory leak for zero2
* [zero] test gradient accumulation
* [zero] remove grad clip test
2 years ago
Ziyue Jiang
b0936e4a44
[rpc] split with dag ( #2028 )
* add DAG to split_module
* add comment
* add test case for DAG
* remove print
* add DAG middleware in scheduler
* add test case for scheduler
* remove break
* recover old lifecycle
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
Jiarui Fang
96134e7be3
[hotfix] add bert test for gemini fwd bwd ( #2035 )
2 years ago
YuliangLiu0306
0dbcd4a6f5
[autoparallel] add split handler ( #2032 )
* [autoparallel] add split handler
* add numerical test and runtime passes
2 years ago
Jiarui Fang
28aa9a4294
[Gemini] more rigorous unit tests for run_fwd_bwd ( #2034 )
2 years ago
YuliangLiu0306
81330b0352
[autoparallel] add experimental permute handler ( #2029 )
2 years ago
Zihao
95c4532fff
[Gemini] paramWrapper paramTracerHook unittest ( #2030 )
2 years ago
Jiarui Fang
8daf1b4db1
[Gemini] patch for supporting torch.add_ function for ColoTensor ( #2003 )
2 years ago
Ziyue Jiang
632753abbc
[fx] Split partition with DAG information ( #2025 )
* add DAG to split_module
* add comment
* add test case for DAG
* remove print
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
YuliangLiu0306
ea0f6b8df9
[autoparallel] add runtime pass and numerical test for view handler ( #2018 )
2 years ago
binmakeswell
bb6245612d
[GitHub] update issue template ( #2023 )
* Update bug-report.yml
* Update documentation.yml
* Update bug-report.yml
* Update feature_request.yml
* Update proposal.yml
2 years ago
Zihao
a719b89a41
[gemini] param_trace_hook ( #2020 )
2 years ago
Frank Lee
254ee2c54f
[workflow] removed unused pypi release workflow ( #2022 )
2 years ago
Jiarui Fang
2e9cbfca12
[Gemini] add unittests to check gemini correctness ( #2015 )
2 years ago
Jiarui Fang
0b0d8f9e17
[hotfix] revert bug PRs ( #2016 )
2 years ago
Zihao
aba3db464d
[Gemini] ParamMemHook ( #2008 )
2 years ago
Zihao
0160a62a3c
[Gemini] param_tracer_wrapper and test case ( #2009 )
2 years ago
YuliangLiu0306
1438993113
[autoparallel] add experimental view handler ( #2011 )
* [autoparallel] add experimental view handler
* polish
* polish
* polish code
* rename variables
2 years ago
Genghan Zhang
d655eea515
[autoparallel] mix gather ( #1977 )
* Add mix-gather
* Add comments
* Add comments
* Polish comments
* Change the global rank assumption
* Add tests
* Add two-step tests
* Fix 10 and 01
* Skip test because of the number of GPUs
2 years ago
Frank Lee
7242bffc5f
[workflow] fixed the python and cpu arch mismatch ( #2010 )
2 years ago
Frank Lee
2bab6f512c
[release] release v0.1.11rc4 ( #2007 )
2 years ago
Jiarui Fang
3d907faede
[Gemini] add an inline_op_module to common test models and polish unittests. ( #2004 )
2 years ago
Frank Lee
56a3dcdabd
[workflow] fixed the typo in condarc ( #2006 )
2 years ago
Frank Lee
7ad9bd14d8
[workflow] added conda cache and fixed no-compilation bug in release ( #2005 )
2 years ago
Boyuan Yao
6cd784ffee
[autoparallel] Add metainfo support for F.linear ( #1987 )
* [fx] metainfo class for auto parallel
* [fx] add unit test for linear metainfo
* [fx] fix bwd param for linear
* [fx] modify unit test
* [fx] modify unit test
* [fx] modify import
* [fx] modify import
* [fx] modify import
* [fx] move meta profiler to auto parallel
* [fx] add conv metainfo class
* [fx] restore profiler
* [fx] restore meta profiler
* [autoparallel] modify unit test
* [fx] modify unit test
* [autoparallel] add batchnorm metainfo class
* [autoparallel] fix batchnorm unit test function declaration
* [fx] restore profiler
* [fx] add relu metainfo class
* [fx] restore profiler
* [autoparallel] modify metainfo input
* [autoparallel] add pooling metainfo
* [autoparallel] add F.linear metainfo generator
2 years ago
Super Daniel
2edbef13cc
[fx] add more meta_registry for MetaTensor execution. ( #2000 )
* [sc] add examples for auto checkpoint.
* merge upstream
* [fx] add more meta_registry for MetaTensor execution.
2 years ago
binmakeswell
d00d905b86
[NFC] polish license ( #1999 )
2 years ago
Jiarui Fang
a2d3266648
[hotfix] make Gemini work for conv DNN ( #1998 )
2 years ago
YuliangLiu0306
155891113e
[autoparallel] use pytree map style to process data ( #1989 )
2 years ago
YuliangLiu0306
35e6b9ec82
[autoparallel] adapt handlers with attention block ( #1990 )
* [autoparallel] adapt handlers with attention block
* polish
2 years ago
Fazzie-Maqianli
b5dbb46172
[example] add diffusion inference ( #1986 )
2 years ago
binmakeswell
a01278e810
Update requirements.txt
2 years ago
YuliangLiu0306
05020e50d0
[autoparallel] support more flexible data type ( #1967 )
2 years ago
Jiarui Fang
5bec3b2168
[Gemini] open grad checkpoint when model building ( #1984 )
2 years ago
Boyuan Yao
c26f21d365
[autoparallel] add pooling metainfo ( #1968 )
* [fx] metainfo class for auto parallel
* [fx] add unit test for linear metainfo
* [fx] fix bwd param for linear
* [fx] modify unit test
* [fx] modify unit test
* [fx] modify import
* [fx] modify import
* [fx] modify import
* [fx] move meta profiler to auto parallel
* [fx] add conv metainfo class
* [fx] restore profiler
* [fx] restore meta profiler
* [autoparallel] modify unit test
* [fx] modify unit test
* [autoparallel] add batchnorm metainfo class
* [autoparallel] fix batchnorm unit test function declaration
* [fx] restore profiler
* [fx] add relu metainfo class
* [fx] restore profiler
* [autoparallel] modify metainfo input
* [autoparallel] add pooling metainfo
2 years ago
Jiarui Fang
3712ac7f90
[Gemini] add bert for MemtracerWrapper unittests ( #1982 )
2 years ago
Jiarui Fang
e481489aa6
[Gemini] MemtracerWrapper unittests ( #1981 )
2 years ago
mandoxzhang
52bd106627
add RoBERTa ( #1980 )
* update roberta
* update roberta & readme
* update roberta & readme
* update roberta & readme
2 years ago
Jiarui Fang
31922110ad
[Gemini] memory trace hook ( #1978 )
2 years ago
Jiarui Fang
0529fcde06
[Gemini] independent runtime tracer ( #1974 )
2 years ago
YuliangLiu0306
0da1d00399
[autoparallel] support distributed dataloader option ( #1906 )
* [autoparallel] support distributed dataloader option
* update output handler to support ddp dataloader
* polish code
2 years ago
Genghan Zhang
6630d45546
[autoparallel] Add alpha beta ( #1973 )
* Add alpha beta
* Fix test
* Fix test
2 years ago