* [feature] support ep for deepseek v3
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix test
* [shardformer] fix deepseek v3 init
* [lazy] fit lora for lazy init
* [example] support npu for deepseek v3
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* support npu
* support pretrain
support pretrain
fix
* support lora
fix
fix
* support chatglm
fix
fxi
fix
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
fix
fix
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
fix
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
fix
fix
fix
* Update train.py
* Update train.py
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [lazy] support _like methods and clamp
* [lazy] pass transformers models
* [lazy] fix device move and requires grad
* [lazy] fix requires grad and refactor api
* [lazy] fix requires grad
* [shardformer] support lazy init
* [shardformer] linear support lazy init
* [shardformer] embedding support lazy init
* [shardformer] norm support lazy init
* [shardformer] fused linear support lazy init
* [test] update shardformer test layer
* [test] shardformer with lazy init fit ddp
* [lazy] hotfix deepcopy of param
* [shardformer] fix bert policy and update test
* [shardformer] fix bloom policy and update test
* [shardformer] fix opt policy and update test
* [shardformer] fix t5 policy and update test
* [shardformer] fix gpt2 policy and update test
* [shardformer] fix llama policy and update test