Commit Graph

672 Commits (c9ec5190a076b130c72ab8a86c35626ac6e3d5e7)

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Jiarui Fang | 1e885329f4 | [test] align model name with the file name. (#2045) | 2 years ago |
| Jiarui Fang | 31c644027b | [hotfix] hotfix Gemini for no leaf modules bug (#2043) | 2 years ago |
| HELSON | 384cd26314 | [zero] fix testing parameters (#2042) | 2 years ago |
| HELSON | 17a3c685b0 | [zero] fix unit-tests (#2039) | 2 years ago |
| Jiarui Fang | eb7742a4bb | [Gemini] more tests for Gemini (#2038) | 2 years ago |
| HELSON | 537e181705 | [testing] fix testing models (#2036) | 2 years ago |
| HELSON | a1ce02d740 | [zero] test gradient accumulation (#1964) | 2 years ago |
| Ziyue Jiang | b0936e4a44 | [rpc] split with dag (#2028) | 2 years ago |
| Jiarui Fang | 96134e7be3 | [hotfix] add bert test for gemini fwd bwd (#2035) | 2 years ago |
| YuliangLiu0306 | 0dbcd4a6f5 | [autoparallel] add split handler (#2032) | 2 years ago |
| Jiarui Fang | 28aa9a4294 | [Gemini] more rigorous unit tests for run_fwd_bwd (#2034) | 2 years ago |
| YuliangLiu0306 | 81330b0352 | [autoparallel] add experimental permute handler (#2029) | 2 years ago |
| Zihao | 95c4532fff | [Gemini] paramWrapper paramTracerHook unitest (#2030) | 2 years ago |
| Jiarui Fang | 8daf1b4db1 | [Gemini] patch for supporting orch.add_ function for ColoTensor (#2003) | 2 years ago |
| Ziyue Jiang | 632753abbc | [fx]Split partition with DAG information (#2025) | 2 years ago |
| YuliangLiu0306 | ea0f6b8df9 | [autoparallel] add runtime pass and numerical test for view handler (#2018) | 2 years ago |
| Jiarui Fang | 2e9cbfca12 | [Gemini] add unitests to check gemini correctness (#2015) | 2 years ago |
| Jiarui Fang | 0b0d8f9e17 | [hotfix] revert bug PRs (#2016) | 2 years ago |
| Zihao | 0160a62a3c | [Gemini] param_tracer_wrapper and test case (#2009) | 2 years ago |
| YuliangLiu0306 | 1438993113 | [autoparallel] add experimental view handler (#2011) | 2 years ago |
| Genghan Zhang | d655eea515 | [autoparallel] mix gather (#1977) | 2 years ago |
| Jiarui Fang | 3d907faede | [Gemini] add an inline_op_module to common test models and polish unitests. (#2004) | 2 years ago |
| Boyuan Yao | 6cd784ffee | [autoparallel] Add metainfo support for F.linear (#1987) | 2 years ago |
| YuliangLiu0306 | 35e6b9ec82 | [autoparallel] adapt handlers with attention block (#1990) | 2 years ago |
| Jiarui Fang | 5bec3b2168 | [Gemini] open grad checkpoint when model building (#1984) | 2 years ago |
| Boyuan Yao | c26f21d365 | [autoparallel] add pooling metainfo (#1968) | 2 years ago |
| Jiarui Fang | 3712ac7f90 | [Gemini] add bert for MemtracerWrapper unintests (#1982) | 2 years ago |
| Jiarui Fang | e481489aa6 | [Gemini] MemtracerWrapper unittests (#1981) | 2 years ago |
| YuliangLiu0306 | 0da1d00399 | [autoparallel] support distributed dataloader option (#1906) | 2 years ago |
| Genghan Zhang | 6630d45546 | [autoparallel] Add alpha beta (#1973) | 2 years ago |
| ver217 | f8a7148dec | [kernel] move all symlinks of kernel to `colossalai._C` (#1971) | 2 years ago |
| Boyuan Yao | 7c7921f71b | [autoparallel] add torch.nn.ReLU metainfo (#1868) | 2 years ago |
| YuliangLiu0306 | fea3cb661c | [autoparallel] support addmm in tracer and solver (#1961) | 2 years ago |
| Jiarui Fang | f7e276fa71 | [Gemini] add GeminiAdamOptimizer (#1960) | 2 years ago |
| HELSON | 7066dfbf82 | [zero] fix memory leak for zero2 (#1955) | 2 years ago |
| Jiarui Fang | 52c6ad26e0 | [ColoTensor] reconfig ColoInitContext, decouple default_pg and default_dist_spec. (#1953) | 2 years ago |
| zbian | 6877121377 | updated flash attention api | 2 years ago |
| Jiarui Fang | 9f4fb3f28a | [ColoTensor] ColoInitContext initialize parameters in shard mode. (#1937) | 2 years ago |
| HELSON | 6e51d296f0 | [zero] migrate zero1&2 (#1878) | 2 years ago |
| Jiarui Fang | 51597f6a28 | [hotfix] pass test_complete_workflow (#1877) | 2 years ago |
| Jiarui Fang | 986f8cbaa7 | [inference] overlap comm and compute in Linear1D_Row when stream_chunk_num > 1 (#1876) | 2 years ago |
| YuliangLiu0306 | 1b494ad73c | [autoparallel] fix linear logical convert issue (#1857) | 2 years ago |
| Jiarui Fang | c2947dadf1 | [inference] streaming Linear 1D Row inference (#1874) | 2 years ago |
| xcnick | a141681260 | [amp] add torch amp test (#1860) | 2 years ago |
| Frank Lee | e6ec99d389 | [utils] fixed lazy init context (#1867) | 2 years ago |
| Jiarui Fang | 3ce4463fe6 | [utils] remove lazy_memory_allocate from ColoInitContext (#1844) | 2 years ago |
| YuliangLiu0306 | f6032ddb17 | [autoparallel] fix bias addition module (#1800) | 2 years ago |
| ver217 | 99870726b1 | [CheckpointIO] a uniform checkpoint I/O module (#1689) | 2 years ago |
| Boyuan Yao | 629172b319 | [autoparallel] add batch norm metainfo (#1815) | 2 years ago |
| Super Daniel | 441d584e4a | [fx] add a symbolic_trace api. (#1812) | 2 years ago |
| Jiarui Fang | 6fa71d65d3 | [fx] skip diffusers unitest if it is not installed (#1799) | 2 years ago |
| oahzxl | 9639ea88fc | [kernel] more flexible flashatt interface (#1804) | 2 years ago |
| Boyuan Yao | 327d07c44a | [autoparallel] add conv metainfo class for auto parallel (#1796) | 2 years ago |
| oahzxl | 501a9e9cd2 | [hotfix] polish flash attention (#1802) | 2 years ago |
| Jiarui Fang | c248800359 | [kernel] skip tests of flash_attn and triton when they are not available (#1798) | 2 years ago |
| YuliangLiu0306 | e34e850a4c | [autoparallel]add essential CommActions for broadcast oprands (#1793) | 2 years ago |
| Boyuan Yao | 05ce3d369f | [fx] Add linear metainfo class for auto parallel (#1783) | 2 years ago |
| YuliangLiu0306 | 2c4c7b3618 | [autoparallel] add getattr handler (#1767) | 2 years ago |
| HELSON | c6a1a62636 | [hotfix] fix zero's incompatibility with checkpoint in torch-1.12 (#1786) | 2 years ago |
| Jiarui Fang | 32c1b843a9 | skip torchrec unittests if not installed (#1790) | 2 years ago |
| kurisusnowdeng | 0b8161fab8 | updated tp layers | 2 years ago |
| YuliangLiu0306 | e859380bf7 | [fx] support module with bias addition (#1780) | 2 years ago |
| Frank Lee | f3f19a5c47 | [autoparallel] added matmul handler (#1763) | 2 years ago |
| YuliangLiu0306 | 27de252334 | [autoparallel] fix conv handler numerical test (#1771) | 2 years ago |
| Super Daniel | 1e88811c7a | [autoparallel] move ckpt solvers to autoparallel folder / refactor code (#1764) | 2 years ago |
| YuliangLiu0306 | a4d1f59c78 | [autoparallel] add numerical test for handlers (#1769) | 2 years ago |
| YuliangLiu0306 | b0f7c8bde8 | [autoparallel] update CommSpec to CommActions (#1768) | 2 years ago |
| YuliangLiu0306 | b4cc59b61e | [autoparallel] add numerical test for node strategies (#1760) | 2 years ago |
| oahzxl | 25952b67d7 | [feat] add flash attention (#1762) | 2 years ago |
| Super Daniel | 0584654c79 | [fx] refactor memory utils and extend shard utils. (#1754) | 2 years ago |
| YuliangLiu0306 | 314d8c497f | [autoparallel] refactor the runtime apply pass and add docstring to passes (#1757) | 2 years ago |
| Frank Lee | f9a613d660 | [autoparallel] added binary elementwise node handler (#1758) | 2 years ago |
| YuliangLiu0306 | d2fc067231 | [autoparallel] fix param hook issue in transform pass (#1755) | 2 years ago |
| Frank Lee | 262652c8bc | [autoparallel] added addbmm handler (#1751) | 2 years ago |
| YuliangLiu0306 | 980ed21723 | [autoparallel] shard param and buffer as expected (#1753) | 2 years ago |
| YuliangLiu0306 | cdb7d5e7d2 | [hotfix] autoparallel unit test (#1752) | 2 years ago |
| YuliangLiu0306 | a4ce180e85 | [autoparallel] add sequential order to communication actions (#1735) | 2 years ago |
| Super Daniel | b893342f95 | [fx] test tracer on diffuser modules. (#1750) | 2 years ago |
| Frank Lee | b80b6eaa88 | [autoparallel] recovered skipped test cases (#1748) | 2 years ago |
| Frank Lee | 474111ecb5 | [autoparallel] fixed wrong sharding strategy in conv handler (#1747) | 2 years ago |
| Frank Lee | 8b8937d901 | [autoparallel] fixed wrong generated strategy for dot op (#1746) | 2 years ago |
| Frank Lee | 88a79814fb | [autoparallel] handled illegal strategy in node handler (#1743) | 2 years ago |
| Super Daniel | 30874f1692 | [fx/profiler] debug the fx.profiler / add an example test script for fx.profiler (#1730) | 2 years ago |
| Frank Lee | eee84908d4 | [autoparallel] handled illegal sharding strategy (#1728) | 2 years ago |
| Ziheng Qin | cbe9a4cb45 | [NFC] polish tests/test_layers/test_3d/test_3d.py code style (#1740) | 2 years ago |
| lucasliunju | 912eb58ea0 | [NFC] polish tests/test_layers/test_3d/checks_3d/common.py code style (#1733) | 2 years ago |
| Xue Fuzhao | 754aa7c81f | [NFC] polish tests/test_layers/test_3d/checks_3d/check_layer_3d.py code style (#1731) | 2 years ago |
| xyupeng | ff373a11eb | [NFC] polish tests/test_layers/test_sequence/checks_seq/check_layer_seq.py code style (#1723) | 2 years ago |
| Kai Wang (Victor Kai) | b38efe4e8a | [NFC] polish test_2p5d/checks_2p5d/check_operation_2p5d.py code style (#1718) | 2 years ago |
| binmakeswell | f6389d0813 | [NFC] polish tests/test_layers/test_2d/checks_2d/check_operation_2d.py code style (#1715) | 2 years ago |
| HELSON | f69f9bf223 | [zero] add chunk init function for users (#1729) | 2 years ago |
| Super Daniel | 393f594051 | [fx/meta/rpc] move _meta_registration.py to fx folder / register fx functions with compatibility checks / remove color debug (#1710) | 2 years ago |
| Frank Lee | e8d8eda5e7 | [autoparallel] moved tests to test_tensor_shard (#1713) | 2 years ago |
| YuliangLiu0306 | 845ff4a47a | [autoparallel] resnet block runtime apply (#1709) | 2 years ago |
| Frank Lee | 22a115406b | [autoparallel] fixed broken node handler tests (#1708) | 2 years ago |
| HELSON | 1468e4bcfc | [zero] add constant placement policy (#1705) | 2 years ago |
| Frank Lee | 6c331a5a09 | [autoparallel] refactored the autoparallel module for organization (#1706) | 2 years ago |
| Frank Lee | 91cd34e6e0 | [unittest] added doc for the pytest wrapper (#1704) | 2 years ago |
| YuliangLiu0306 | 451cd72dea | [autoparallel] adapt runtime passes (#1703) | 2 years ago |
| Jiarui Fang | 21962e1593 | [embedding] rename FreqAwareEmbedding -> CachedEmbedding (#1699) | 2 years ago |
| Frank Lee | 0e52f3d3d5 | [unittest] supported condititonal testing based on env var (#1701) | 2 years ago |
| Frank Lee | 8283e95db3 | [autoparallel] collated all deprecated files (#1700) | 2 years ago |
| YuliangLiu0306 | 81f7530ee7 | [autoparallel] adapt solver and CostGraph with new handler (#1695) | 2 years ago |
| YuliangLiu0306 | 42b882ef06 | [autoparallel] add output handler and placeholder handler (#1694) | 2 years ago |
| YuliangLiu0306 | 56088e6d98 | [autoparallel] add pooling handler (#1690) | 2 years ago |
| YuliangLiu0306 | 319d654f79 | [autoparallel] where_handler_v2 (#1688) | 2 years ago |
| Boyuan Yao | 31d2f03d27 | [autoparallel] fix C version rotor inconsistency (#1691) | 2 years ago |
| Frank Lee | 4973157ad7 | [autoparallel] added sharding spec conversion for linear handler (#1687) | 2 years ago |
| YuliangLiu0306 | af718e83f2 | [autoparallel] add reshape handler v2 and fix some previous bug (#1683) | 2 years ago |
| Super Daniel | 3dd6994427 | [fx/profiler] assigned UUID to each unrecorded tensor/ improved performance on GPT-2 (#1679) | 2 years ago |
| YuliangLiu0306 | 517b63939a | [autoparallel] add unary element wise handler v2 (#1674) | 2 years ago |
| YuliangLiu0306 | f6c6a932b8 | [autoparallel] add following node generator (#1673) | 2 years ago |
| YuliangLiu0306 | 52fda88796 | [autoparallel] add layer norm handler v2 (#1671) | 2 years ago |
| HELSON | b28991dd0a | [feature] A new ZeRO implementation (#1644) | 2 years ago |
| Boyuan Yao | 1df98d5b66 | [autoparallel] add rotor C version (#1658) | 2 years ago |
| YuliangLiu0306 | 11ec070e53 | [hotfix]unit test (#1670) | 2 years ago |
| Frank Lee | a60024e77a | [autoparallel] added utils for broadcast operation (#1665) | 2 years ago |
| YuliangLiu0306 | 3f068d1409 | [autoparallel] update CommSpec (#1667) | 2 years ago |
| YuliangLiu0306 | 746f8f979d | [autoparallel] add batch norm handler v2 (#1666) | 2 years ago |
| Kirigaya Kazuto | 9708638ded | [pipeline/pytree] add pytree to process args and kwargs \| provide `data_process_func` to process args and kwargs after forward (#1642) | 2 years ago |
| Frank Lee | 3a4d6f63a8 | [autoparallel] added node handler for bmm (#1655) | 2 years ago |
| YuliangLiu0306 | 095854477f | [autoparallel] add conv handler v2 (#1663) | 2 years ago |
| YuliangLiu0306 | 1e7816a460 | [autoparallel] adapt solver with gpt (#1653) | 2 years ago |
| Frank Lee | 30e50c8b4a | [autoparallel] implemented all matmul strategy generator (#1650) | 2 years ago |
| YuliangLiu0306 | 03978aad45 | [autoparallel] change the following nodes strategies generation logic (#1636) | 2 years ago |
| YuliangLiu0306 | 59f100510a | [autoparallel] where handler (#1651) | 2 years ago |
| Boyuan Yao | 5d0fdb9cb4 | [fx] fix offload codegen test (#1648) | 2 years ago |
| Frank Lee | 45b39a692a | [autoparallel] implemented linear projection strategy generator (#1639) | 2 years ago |
| Frank Lee | 154d3ef432 | [fix] fixed the collective pattern name for consistency (#1649) | 2 years ago |
| YuliangLiu0306 | b2b2a4af98 | [autoparallel] adapt solver with mlp (#1638) | 2 years ago |
| Jiarui Fang | c5d39215f6 | Revert "[feature] new zero implementation (#1623)" (#1643) | 2 years ago |
| HELSON | 5be118f405 | [feature] new zero implementation (#1623) | 2 years ago |
| HELSON | 95c35f73bd | [moe] initialize MoE groups by ProcessGroup (#1640) | 2 years ago |
| HELSON | a088022efc | [moe] fix moe bugs (#1633) | 2 years ago |
| YuliangLiu0306 | 702dbc5288 | [tensor] use communication autograd func (#1617) | 2 years ago |
| YuliangLiu0306 | 0c703189b9 | [autoparallel] add layernorm handler (#1629) | 2 years ago |
| YuliangLiu0306 | bf77d3ab65 | [autoparallel] recover the merged node strategy index (#1613) | 2 years ago |
| Boyuan Yao | d6b01feb66 | [fx] Modify offload codegen (#1618) | 2 years ago |
| YuliangLiu0306 | 9eae855408 | [hotfix] add recompile after graph manipulatation (#1621) | 2 years ago |
| Super Daniel | d967779a32 | [fx/profiler] tuned the calculation of memory estimation (#1619) | 2 years ago |
| HELSON | f7f2248771 | [moe] fix MoE bugs (#1628) | 2 years ago |
| Jiarui Fang | 38c68b5b9a | [embedding] rollback for better FAW performance (#1625) | 2 years ago |
| Frank Lee | d925122020 | [autoparallel] added new linear module handler (#1616) | 2 years ago |
| Kirigaya Kazuto | 170fa81095 | [pipeline/chimera] test chimera \| fix bug of initializing (#1615) | 2 years ago |
| Jiarui Fang | 504ff1d101 | [embeddings] use cache_ratio instead of cuda_row_num (#1611) | 2 years ago |
| YuliangLiu0306 | 7d1bb71d5d | [fx] PoC of runtime shape consistency application (#1607) | 2 years ago |
| YuliangLiu0306 | 47b11c432c | [autoparallel]add bcast matmul strategies (#1605) | 2 years ago |
| Boyuan Yao | 933b6c6367 | [fx] Add pofo solver (#1608) | 2 years ago |
| Kirigaya Kazuto | edc9e419ad | [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule \| finish Chimera (#1595) | 2 years ago |
| YuliangLiu0306 | eac1b79371 | [autoparallel] add bcast op handler (#1600) | 2 years ago |