
953 Commits (ff16773ded5ffc24a87a189f2b0cb5f14cd4702d)

Author SHA1 Message Date
Boyuan Yao d5c5bc219e
[SC] add GPT example for auto checkpoint (#1889)
2 years ago
Junming Wu 14a0b18305
[NFC] polish colossalai/amp/naive_amp/__init__.py code style (#1905)
2 years ago
HELSON 6e51d296f0
[zero] migrate zero1&2 (#1878)
2 years ago
Super Daniel cc55ff0aa4
[autoparallel] user-friendly API for CheckpointSolver. (#1879)
2 years ago
Super Daniel 448248b27c
[fx] metainfo_trace as an API. (#1873)
2 years ago
Jiarui Fang 986f8cbaa7
[inference] overlap comm and compute in Linear1D_Row when stream_chunk_num > 1 (#1876)
2 years ago
YuliangLiu0306 1b494ad73c
[autoparallel] fix linear logical convert issue (#1857)
2 years ago
Jiarui Fang c2947dadf1
[inference] streaming Linear 1D Row inference (#1874)
2 years ago
Frank Lee e6ec99d389
[utils] fixed lazy init context (#1867)
2 years ago
zbian 653b0a620e
added skip_bias_add for non-tp linear
2 years ago
LuGY 94329fc139
[NFC] polish colossalai/amp/apex_amp/__init__.py code style (#1853)
2 years ago
zbian 1559a09fb7
[NFC] polish amp.naive_amp.grad_scaler code style
2 years ago
HELSON 72c9448920
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/operator_handler.py code style (#1845)
2 years ago
Genghan Zhang b25030cc07
[NFC] polish ./colossalai/amp/torch_amp/__init__.py code style (#1836)
2 years ago
Sze-qq 95ac4f88ea
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/conv_handler.py code style (#1829)
2 years ago
Ziyue Jiang 5da03c936d
[NFC] polish colossalai/amp/torch_amp/_grad_scaler.py code style (#1823)
2 years ago
Fazzie-Maqianli 399f84d8f6
[NFC] polish colossalai/amp/naive_amp/_fp16_optimizer.py code style (#1819)
2 years ago
CsRic 9623ec1b02
[NFC] polish colossalai/amp/naive_amp/_utils.py code style (#1816)
2 years ago
binmakeswell 3c3714fc2a
[NFC] polish strategies_constructor.py code style (#1806)
2 years ago
Jiarui Fang 3ce4463fe6
[utils] remove lazy_memory_allocate from ColoInitContext (#1844)
2 years ago
Jiarui Fang fba34efb5a
version to 0.1.11rc2 (#1832)
2 years ago
YuliangLiu0306 49216d7ab1
[autoparallel] fix bugs caused by negative dim key (#1808)
2 years ago
アマデウス 4268ae017b
[kernel] added jit warmup (#1792)
2 years ago
YuliangLiu0306 f6032ddb17
[autoparallel] fix bias addition module (#1800)
2 years ago
Jiarui Fang cd5a0d56fa
[Gemini] make gemini usage simple (#1821)
2 years ago
ver217 99870726b1
[CheckpointIO] a uniform checkpoint I/O module (#1689)
2 years ago
Boyuan Yao 629172b319
[autoparallel] add batch norm metainfo (#1815)
2 years ago
Super Daniel 441d584e4a
[fx] add a symbolic_trace api. (#1812)
2 years ago
xcnick e0da01ea71
[hotfix] fix build error when torch version >= 1.13 (#1803)
2 years ago
oahzxl 9639ea88fc
[kernel] more flexible flashatt interface (#1804)
2 years ago
Zihao 20e255d4e8
MemStatsCollectorStatic (#1765)
2 years ago
Boyuan Yao 327d07c44a
[autoparallel] add conv metainfo class for auto parallel (#1796)
2 years ago
oahzxl 501a9e9cd2
[hotfix] polish flash attention (#1802)
2 years ago
Jiarui Fang 218c75fd9d
[NFC] polish type hint for shape consistency (#1801)
2 years ago
Jiarui Fang c248800359
[kernel] skip tests of flash_attn and triton when they are not available (#1798)
2 years ago
YuliangLiu0306 e34e850a4c
[autoparallel]add essential CommActions for broadcast oprands (#1793)
2 years ago
Boyuan Yao 05ce3d369f
[fx] Add linear metainfo class for auto parallel (#1783)
2 years ago
Super Daniel e8a9bebc87
[autoparallel] refactor and add rotorc. (#1789)
2 years ago
YuliangLiu0306 2c4c7b3618
[autoparallel] add getattr handler (#1767)
2 years ago
HELSON c6a1a62636
[hotfix] fix zero's incompatibility with checkpoint in torch-1.12 (#1786)
2 years ago
kurisusnowdeng 0b8161fab8
updated tp layers
2 years ago
Jiarui Fang cb5a587e9a
[hotfix] polish chunk import (#1787)
2 years ago
YuliangLiu0306 e859380bf7
[fx] support module with bias addition (#1780)
2 years ago
Frank Lee f3f19a5c47
[autoparallel] added matmul handler (#1763)
2 years ago
Ziyue Jiang 4df0194976
[Pipeline]Adapt to Pipelinable OPT (#1782)
2 years ago
YuliangLiu0306 27de252334
[autoparallel] fix conv handler numerical test (#1771)
2 years ago
Super Daniel 1e88811c7a
[autoparallel] move ckpt solvers to autoparallel folder / refactor code (#1764)
2 years ago
Jiarui Fang f34dab4270
[compatibility] ChunkMgr import error (#1772)
2 years ago
YuliangLiu0306 b0f7c8bde8
[autoparallel] update CommSpec to CommActions (#1768)
2 years ago
YuliangLiu0306 b4cc59b61e
[autoparallel] add numerical test for node strategies (#1760)
2 years ago