Commit Graph

20 Commits (b29e1f07224298aea35aab7ee83284beac28e0d8)

Author SHA1 Message Date
digger yu 9265f2d4d7
[NFC]fix typo colossalai/auto_parallel nn utils etc. (#3779)
* fix typo colossalai/autochunk auto_parallel amp

* fix typo colossalai/auto_parallel nn utils etc.
2023-05-23 15:28:20 +08:00
Frank Lee 551cafec14
[doc] updated kernel-related optimisers' docstring (#2385)
* [doc] updated kernel-related optimisers' docstring

* polish doc
2023-01-09 17:13:53 +08:00
Frank Lee 40d376c566
[setup] support pre-build and jit-build of cuda kernels (#2374)
* [setup] support pre-build and jit-build of cuda kernels

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code
2023-01-06 20:50:26 +08:00
HELSON e7d3afc9cc
[optimizer] add div_scale for optimizers (#2117)
* [optimizer] add div_scale for optimizers

* [zero] use div_scale in zero optimizer

* fix testing error
2022-12-12 17:58:57 +08:00
ver217 f8a7148dec
[kernel] move all symlinks of kernel to `colossalai._C` (#1971) 2022-11-17 13:42:33 +08:00
ver217 367c615818
fix nvme docstring (#1450) 2022-08-12 18:01:02 +08:00
ver217 12b4887097
[hotfix] fix CPUAdam kernel nullptr (#1410) 2022-08-05 19:45:45 +08:00
HELSON c7221cb2d4
[hotfix] adapt ProcessGroup and Optimizer to ColoTensor (#1388) 2022-07-29 19:33:24 +08:00
ver217 c415240db6
[nvme] CPUAdam and HybridAdam support NVMe offload (#1360)
* impl nvme optimizer

* update cpu adam

* add unit test

* update hybrid adam

* update docstr

* add TODOs

* update CI

* fix CI

* fix CI

* fix CI path

* fix CI path

* fix CI path

* fix install tensornvme

* fix CI

* fix CI path

* fix CI env variables

* test CI

* test CI

* fix CI

* fix nvme optim __del__

* fix adam __del__

* fix nvme optim

* fix CI env variables

* fix nvme optim import

* test CI

* test CI

* fix CI
2022-07-26 17:25:24 +08:00
HELSON a9b8300d54
[zero] improve adaptability for not-shard parameters (#708)
* adapt post grad hooks for not-shard parameters
* adapt optimizer for not-shard parameters
* offload gradients for not-replicated parameters
2022-04-11 13:38:51 +08:00
HELSON b31daed4cf
fix bugs in CPU adam (#633)
* add cpu adam counter for all cpu adam

* fixed updating error in adam kernel
2022-04-02 17:04:05 +08:00
ver217 e619a651fb
polish optimizer docstring (#619) 2022-04-01 16:27:03 +08:00
LuGY c44d797072
[docs] updated docs of hybrid adam and cpu adam (#552) 2022-03-30 18:14:59 +08:00
LuGY 105c5301c3
[zero]added hybrid adam, removed loss scale in adam (#527)
* [zero]added hybrid adam, removed loss scale of adam

* remove useless code
2022-03-25 18:03:54 +08:00
ver217 9ec1ce6ab1
[zero] sharded model support the reuse of fp16 shard (#495)
* sharded model supports reuse fp16 shard

* rename variable

* polish code

* polish code

* polish code
2022-03-23 14:59:59 +08:00
ver217 62b0a8d644
[zero] sharded optim support hybrid cpu adam (#486)
* sharded optim support hybrid cpu adam

* update unit test

* polish docstring
2022-03-22 14:56:59 +08:00
Jiarui Fang 0fcfb1e00d
[test] make zero engine test really work (#447) 2022-03-17 17:24:25 +08:00
Jiarui Fang 237d08e7ee
[zero] hybrid cpu adam (#445) 2022-03-17 15:05:41 +08:00
Kai Wang (Victor Kai) 53bb3bcc0a
fix format (#362) 2022-03-11 15:50:28 +08:00
LuGY a3269de5c9
[zero] cpu adam kernel (#288)
* Added CPU Adam

* finished the cpu adam

* updated the license

* delete useless parameters, removed resnet

* modified the method of cpu adam unittest

* deleted some useless codes

* removed useless codes

Co-authored-by: ver217 <lhx0217@gmail.com>
Co-authored-by: Frank Lee <somerlee.9@gmail.com>
Co-authored-by: jiaruifang <fangjiarui123@gmail.com>
2022-03-11 15:50:28 +08:00