203 Commits (8106ede07fae7e239203feb815162efdf46975ec)

Author SHA1 Message Date
Frank Lee 8823cc4831
Merge pull request #5310 from hpcaitech/feature/npu
10 months ago
Frank Lee 7cfed5f076
[feat] refactored extension module (#5298)
10 months ago
digger yu bce9499ed3
fix some typo (#5307)
10 months ago
ver217 148469348a
Merge branch 'main' into sync/npu
10 months ago
Hongxin Liu d202cc28c0
[npu] change device to accelerator api (#5239)
11 months ago
Wenhao Chen 7172459e74
[shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)
1 year ago
Xuanlei Zhao 3acbf6d496
[npu] add npu support for hybrid plugin and llama (#5090)
1 year ago
Hongxin Liu e5ce4c8ea6
[npu] add npu support for gemini and zero (#5067)
1 year ago
Xuanlei Zhao dc003c304c
[moe] merge moe into main (#4978)
1 year ago
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
1 year ago
Hongxin Liu b5f9e37c70
[legacy] clean up legacy code (#4743)
1 year ago
Hongxin Liu 554aa9592e
[legacy] move communication and nn to legacy and refactor logger (#4671)
1 year ago
Hongxin Liu ac178ca5c1
[legacy] move builder and registry to legacy (#4603)
1 year ago
Hongxin Liu 8accecd55b
[legacy] move engine to legacy (#4560)
1 year ago
Hongxin Liu 16bf4c0221
[test] remove useless tests (#4359)
1 year ago
digger yu a9d1cadc49
fix typo with colossalai/trainer utils zero (#3908)
1 year ago
Hongxin Liu dbb32692d2
[lazy] refactor lazy init (#3891)
2 years ago
digger yu 9265f2d4d7
[NFC]fix typo colossalai/auto_parallel nn utils etc. (#3779)
2 years ago
digger-yu b9a8dff7e5
[doc] Fix typo under colossalai and doc(#3618)
2 years ago
Hongxin Liu 4341f5e8e6
[lazyinit] fix clone and deepcopy (#3553)
2 years ago
Hongxin Liu 152239bbfa
[gemini] gemini supports lazy init (#3379)
2 years ago
Frank Lee 80eba05b0a
[test] refactor tests with spawn (#3452)
2 years ago
ver217 26b7aac0be
[zero] reorganize zero/gemini folder structure (#3424)
2 years ago
ver217 f8289d4221
[lazyinit] combine lazy tensor with dtensor (#3204)
2 years ago
ver217 6ae8ed0407
[lazyinit] add correctness verification (#3147)
2 years ago
ver217 ed8f60b93b
[lazyinit] refactor lazy tensor and lazy init ctx (#3131)
2 years ago
ver217 823f3b9cf4
[doc] add deepspeed citation and copyright (#2996)
2 years ago
YH a848091141
Fix port exception type (#2925)
2 years ago
Nikita Shulga 01066152f1
Don't use `torch._six` (#2775)
2 years ago
ver217 f0aa191f51
[gemini] fix colo_init_context (#2683)
2 years ago
HELSON 552183bb74
[polish] polish ColoTensor and its submodules (#2537)
2 years ago
Super Daniel 35c0c0006e
[utils] lazy init. (#2148)
2 years ago
HELSON 7829aa094e
[ddp] add is_ddp_ignored (#2434)
2 years ago
Frank Lee 40d376c566
[setup] support pre-build and jit-build of cuda kernels (#2374)
2 years ago
Jiarui Fang 355ffb386e
[builder] unified cpu_optim fused_optim inferface (#2190)
2 years ago
Jiarui Fang 9587b080ba
[builder] use runtime builder for fused_optim (#2189)
2 years ago
BlueRum b3f73ce1c8
[Gemini] Update coloinit_ctx to support meta_tensor (#2147)
2 years ago
Jiarui Fang 8e14344ec9
[hotfix] fix a type in ColoInitContext (#2106)
2 years ago
Jiarui Fang 05545bfee9
[ColoTensor] throw error when ColoInitContext meets meta parameter. (#2105)
2 years ago
HELSON f6178728a0
[gemini] fix init bugs for modules (#2047)
2 years ago
Jiarui Fang 31c644027b
[hotfix] hotfix Gemini for no leaf modules bug (#2043)
2 years ago
ver217 f8a7148dec
[kernel] move all symlinks of kernel to `colossalai._C` (#1971)
2 years ago
Jiarui Fang 7e24b9b9ee
[Gemini] clean no used MemTraceOp (#1970)
2 years ago
Jiarui Fang 52c6ad26e0
[ColoTensor] reconfig ColoInitContext, decouple default_pg and default_dist_spec. (#1953)
2 years ago
Jiarui Fang 9f4fb3f28a
[ColoTensor] ColoInitContext initialize parameters in shard mode. (#1937)
2 years ago
Frank Lee e6ec99d389
[utils] fixed lazy init context (#1867)
2 years ago
Jiarui Fang 3ce4463fe6
[utils] remove lazy_memory_allocate from ColoInitContext (#1844)
2 years ago
ver217 99870726b1
[CheckpointIO] a uniform checkpoint I/O module (#1689)
2 years ago
HELSON 1468e4bcfc
[zero] add constant placement policy (#1705)
2 years ago
Kirigaya Kazuto 3b2a59b0ba
[pipeline/rank_recorder] fix bug when process data before backward | add a tool for multiple ranks debug (#1681)
2 years ago