* [misc] remove config arg from initialize
* [misc] remove old tensor constructor
* [plugin] add npu support for ddp
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [devops] fix doc test ci
* [test] fix test launch
* [doc] update launch doc
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [npu] setup device utils (#5047)
* [npu] add npu device support
* [npu] support low level zero
* [test] update npu zero plugin test
* [hotfix] fix import
* [test] recover tests
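The device-utils commits above route device selection through one helper instead of hard-coding `cuda`, so Ascend NPUs can be picked up transparently. A minimal sketch of the idea, assuming `torch_npu` provides the NPU backend; `get_current_device` here is an illustrative helper, not the library's actual API:

```python
import torch

def npu_available() -> bool:
    # torch_npu registers the "npu" backend on Ascend machines; treat its
    # absence as "no NPU" rather than as an error.
    try:
        import torch_npu  # noqa: F401
        return torch.npu.is_available()
    except ImportError:
        return False

def get_current_device() -> torch.device:
    # Prefer NPU, then CUDA, then fall back to CPU.
    if npu_available():
        return torch.device(f"npu:{torch.npu.current_device()}")
    if torch.cuda.is_available():
        return torch.device(f"cuda:{torch.cuda.current_device()}")
    return torch.device("cpu")

# Callers create tensors and modules on whatever accelerator is present.
x = torch.randn(4, 4, device=get_current_device())
```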
* [npu] gemini support npu (#5052)
* [npu] refactor device utils
* [gemini] support npu
* [example] llama2+gemini support npu
* [kernel] add arm cpu adam kernel (#5065)
* [kernel] add arm cpu adam
* [optim] update adam optimizer
* [kernel] arm cpu adam remove bf16 support
* [kernel] support pure fp16 for cpu adam (#4896)
* [kernel] fix cpu adam kernel for pure fp16 and update tests (#4919)
* [kernel] fix cpu adam
* [test] update gemini optim test
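The CPU Adam commits above deal with stepping low-precision (fp16/bf16) parameters. Below is a plain-Python reference of the usual pattern the fused kernel implements, fp32 optimizer math plus a low-precision write-back; an illustrative sketch, not the C++/NEON kernel itself:

```python
import torch

def cpu_adam_step(param_lp, master, exp_avg, exp_avg_sq, grad_lp, step,
                  lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # All optimizer math runs in fp32; the fp16/bf16 tensors only hold copies.
    grad = grad_lp.float()
    exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
    exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)
    bias1 = 1 - beta1 ** step
    bias2 = 1 - beta2 ** step
    denom = (exp_avg_sq / bias2).sqrt().add_(eps)
    master.addcdiv_(exp_avg / bias1, denom, value=-lr)
    param_lp.copy_(master)  # write the updated weights back in low precision

# toy usage with fp16 parameters and an fp32 master copy
p = torch.randn(8).half()
m = p.float()
exp_avg, exp_avg_sq = torch.zeros_like(m), torch.zeros_like(m)
g = torch.randn(8).half()
cpu_adam_step(p, m, exp_avg, exp_avg_sq, g, step=1)
```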
* [legacy] move communication to legacy (#4640)
* [legacy] refactor logger and clean up legacy codes (#4654)
* [legacy] make logger independent to gpc
* [legacy] make optim independent to registry
* [legacy] move test engine to legacy
* [legacy] move nn to legacy (#4656)
* [legacy] move nn to legacy
* [checkpointio] fix save hf config
* [test] remove useless rpc pp test
* [legacy] fix nn init
* [example] skip tutorial hybrid parallel example
* [devops] test doc check
* [devops] test doc check
* add dropout layer, add dropout test
* modify seed manager as context manager
* add a copy of col_nn.layer
* add dist_crossentropy loss; separate module test
* polish the code
* fix dist crossentropy loss
* init shardformer code structure
* add implementation of sharder (inject and replace)
* add implementation of replacing layers with colossal layers
* separate different layer policies, add some notes
* implement 1D and 2D slicers that can tell column from row
* fix bug when slicing and injecting model
* fix some bugs; add inference test example
* add share weight and train example
* add train
* add docstring and readme
* add docstring for other files
* pre-commit
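The "seed manager as context manager" and dropout commits above revolve around making dropout reproducible across tensor-parallel ranks by pinning the RNG state for a region of code. A minimal CPU-only sketch of that idea (names are illustrative, and real code would also save and restore device RNG states):

```python
import contextlib
import torch

@contextlib.contextmanager
def seed_context(seed: int):
    # Run the enclosed ops under a fixed seed, then restore the previous RNG state.
    state = torch.get_rng_state()
    torch.manual_seed(seed)
    try:
        yield
    finally:
        torch.set_rng_state(state)

# Ranks that enter the context with the same seed drop the same elements,
# which is what a tensor-parallel dropout layer needs on replicated inputs.
x = torch.ones(4, 4)
with seed_context(1234):
    y = torch.nn.functional.dropout(x, p=0.5)
```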
* [bf16] add bf16 support for fused adam (#3844)
* [bf16] fused adam kernel support bf16
* [test] update fused adam kernel test
* [test] update fused adam test
* [bf16] cpu adam and hybrid adam optimizers support bf16 (#3860)
* [bf16] implement mixed precision mixin and add bf16 support for low level zero (#3869)
* [bf16] add mixed precision mixin
* [bf16] low level zero optim support bf16
* [test] update low level zero test
* [test] fix low level zero grad acc test
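The mixed precision mixin above factors the dtype-specific bookkeeping out of the optimizers: fp16 needs dynamic loss scaling and overflow checks, bf16 needs neither. A rough sketch of that separation with illustrative class names, not the library's actual interface:

```python
import torch

class MixedPrecisionMixin:
    """Hooks an optimizer wrapper calls around backward() and step()."""

    def pre_backward(self, loss: torch.Tensor) -> torch.Tensor:
        return loss

    def should_skip_step(self) -> bool:
        return False

class BF16Mixin(MixedPrecisionMixin):
    # bf16 keeps fp32's exponent range, so no loss scaling is needed.
    pass

class FP16Mixin(MixedPrecisionMixin):
    def __init__(self, init_scale: float = 2.0 ** 16):
        self.scale = init_scale

    def pre_backward(self, loss: torch.Tensor) -> torch.Tensor:
        return loss * self.scale  # scale the loss so fp16 grads do not underflow

    def should_skip_step(self) -> bool:
        # A real implementation checks gradients for inf/nan, adjusts
        # self.scale, and skips the step when an overflow is detected.
        return False
```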
* [bf16] add bf16 support for gemini (#3872)
* [bf16] gemini support bf16
* [test] update gemini bf16 test
* [doc] update gemini docstring
* [bf16] add bf16 support for plugins (#3877)
* [bf16] add bf16 support for legacy zero (#3879)
* [zero] init context support bf16
* [zero] legacy zero support bf16
* [test] add zero bf16 test
* [doc] add bf16 related docstring for legacy zero
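Taken together, the bf16 commits follow the common recipe: compute in bf16, keep fp32 master weights in the optimizer, and skip the loss scaling that fp16 requires. A generic, framework-free sketch of one such training step:

```python
import torch

model = torch.nn.Linear(16, 16).to(torch.bfloat16)
master = {n: p.detach().float().clone() for n, p in model.named_parameters()}
opt = torch.optim.SGD(master.values(), lr=1e-2)

x = torch.randn(4, 16).to(torch.bfloat16)
loss = model(x).float().pow(2).mean()  # bf16 forward; no loss scaling needed
loss.backward()

for n, p in model.named_parameters():
    master[n].grad = p.grad.float()    # fp32 gradients drive the update
opt.step()
opt.zero_grad()

with torch.no_grad():
    for n, p in model.named_parameters():
        p.copy_(master[n])             # cast updated fp32 weights back to bf16
```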
* Fixed several spelling errors under colossalai
* Fix spelling errors in the colossalai and docs directories
* Cautiously changed spelling errors under the example folder
* Update runtime_preparation_pass.py
revert autograft to autograd
* Update search_chunk.py
change 'utile' to 'until'
* Update check_installation.py
change misteach to mismatch in line 91
* Update 1D_tensor_parallel.md
revert to perceptron
* Update 2D_tensor_parallel.md
revert to perceptron in line 73
* Update 2p5D_tensor_parallel.md
revert to perceptron in line 71
* Update 3D_tensor_parallel.md
revert to perceptron in line 80
* Update README.md
revert to resnet in line 42
* Update reorder_graph.py
revert to indice in line 7
* Update p2p.py
revert to megatron in line 94
* Update initialize.py
revert to torchrun in line 198
* Update routers.py
change to detailed in line 63
* Update routers.py
change to detailed in line 146
* Update README.md
revert random number in line 402