Commit Graph

953 Commits (ff16773ded5ffc24a87a189f2b0cb5f14cd4702d)

Author SHA1 Message Date
oahzxl 25952b67d7
[feat] add flash attention (#1762)
2 years ago
Super Daniel 0584654c79
[fx] refactor memory utils and extend shard utils. (#1754)
2 years ago
Ziyue Jiang 63f250bbd4
fix file name (#1759)
2 years ago
YuliangLiu0306 314d8c497f
[autoparallel] refactor the runtime apply pass and add docstring to passes (#1757)
2 years ago
Frank Lee f9a613d660
[autoparallel] added binary elementwise node handler (#1758)
2 years ago
YuliangLiu0306 d2fc067231
[autoparallel] fix param hook issue in transform pass (#1755)
2 years ago
Frank Lee 262652c8bc
[autoparallel] added addbmm handler (#1751)
2 years ago
YuliangLiu0306 980ed21723
[autoparallel] shard param and buffer as expected (#1753)
2 years ago
YuliangLiu0306 cdb7d5e7d2
[hotfix] autoparallel unit test (#1752)
2 years ago
YuliangLiu0306 a4ce180e85
[autoparallel] add sequential order to communication actions (#1735)
2 years ago
Frank Lee 474111ecb5
[autoparallel] fixed wrong sharding strategy in conv handler (#1747)
2 years ago
Frank Lee 8b8937d901
[autoparallel] fixed wrong generated strategy for dot op (#1746)
2 years ago
Frank Lee 993b8875b6
[autoparallel] handled illegal sharding strategy in shape consistency (#1744)
2 years ago
Frank Lee 88a79814fb
[autoparallel] handled illegal strategy in node handler (#1743)
2 years ago
Super Daniel 30874f1692
[fx/profiler] debug the fx.profiler / add an example test script for fx.profiler (#1730)
2 years ago
Frank Lee eee84908d4
[autoparallel] handled illegal sharding strategy (#1728)
2 years ago
Sze-qq 23703c9dd6
[NFC] polish colossalai/nn/metric/_utils.py code style (#1727)
2 years ago
Ofey Chan 7e62af28a0
[NFC] polish accuracy_2d.py code style (#1719)
2 years ago
LuGY 730f88f8e1
[NFC] polish _checkpoint_hook.py code style (#1722)
2 years ago
CsRic ea961d8fd1
[NFC] polish colossalai/zero/sharded_param/__init__.py code style (#1717)
2 years ago
yuxuan-lou 2b49ca80a3
[NFC] polish colossalai/nn/lr_scheduler/linear.py code style (#1716)
2 years ago
shenggan e1d780030d
[NFC] polish colossalai/nn/metric/accuracy_2p5d.py code style (#1714)
2 years ago
YuliangLiu0306 d373e67b99
[hotfix] resharding cost issue (#1742)
2 years ago
Jiarui Fang 24e84eba60
upgrade version to 0.1.11rc1 (#1739)
2 years ago
Frank Lee d2e0e39c9d
[release] update to v0.1.11 (#1736)
2 years ago
HELSON f69f9bf223
[zero] add chunk init function for users (#1729)
2 years ago
YuliangLiu0306 51b89d2202
[autoparallel] runtime_backward_apply (#1720)
2 years ago
Super Daniel 393f594051
[fx/meta/rpc] move _meta_registration.py to fx folder / register fx functions with compatibility checks / remove color debug (#1710)
2 years ago
YuliangLiu0306 845ff4a47a
[autoparallel] resnet block runtime apply (#1709)
2 years ago
Frank Lee 22a115406b
[autoparallel] fixed broken node handler tests (#1708)
2 years ago
HELSON 1468e4bcfc
[zero] add constant placement policy (#1705)
2 years ago
binmakeswell 5f41463a76
add optimizer README for tutorials (#1707)
2 years ago
Frank Lee 6c331a5a09
[autoparallel] refactored the autoparallel module for organization (#1706)
2 years ago
Frank Lee 91cd34e6e0
[unittest] added doc for the pytest wrapper (#1704)
2 years ago
YuliangLiu0306 451cd72dea
[autoparallel] adapt runtime passes (#1703)
2 years ago
Jiarui Fang 21962e1593
[embedding] rename FreqAwareEmbedding -> CachedEmbedding (#1699)
2 years ago
Frank Lee 0e52f3d3d5
[unittest] supported condititonal testing based on env var (#1701)
2 years ago
Frank Lee 8283e95db3
[autoparallel] collated all deprecated files (#1700)
2 years ago
Frank Lee e2355d01b9
[autoparallel] init new folder structure (#1696)
2 years ago
YuliangLiu0306 81f7530ee7
[autoparallel] adapt solver and CostGraph with new handler (#1695)
2 years ago
YuliangLiu0306 42b882ef06
[autoparallel] add output handler and placeholder handler (#1694)
2 years ago
YuliangLiu0306 56088e6d98
[autoparallel] add pooling handler (#1690)
2 years ago
YuliangLiu0306 319d654f79
[autoparallel] where_handler_v2 (#1688)
2 years ago
Boyuan Yao 31d2f03d27
[autoparallel] fix C version rotor inconsistency (#1691)
2 years ago
Jiarui Fang 363fc2861a
[embeddings] more detailed timer (#1692)
2 years ago
Frank Lee 4973157ad7
[autoparallel] added sharding spec conversion for linear handler (#1687)
2 years ago
YuliangLiu0306 af718e83f2
[autoparallel] add reshape handler v2 and fix some previous bug (#1683)
2 years ago
YuliangLiu0306 6878e42248
[hotfix] solver bug caused by dict type comm cost (#1686)
2 years ago
Super Daniel 3dd6994427
[fx/profiler] assigned UUID to each unrecorded tensor/ improved performance on GPT-2 (#1679)
2 years ago
Kirigaya Kazuto 0df5034a36
[pipeline/fix-bug] num_microbatches support any integrate | stable chimera | launch tool for rpc pp framework (#1684)
2 years ago
jim e5ab6be72e
[hotfix[ fix colotensor.type() raise NotImplementedError (#1682)
2 years ago
Kirigaya Kazuto 3b2a59b0ba
[pipeline/rank_recorder] fix bug when process data before backward | add a tool for multiple ranks debug (#1681)
2 years ago
YuliangLiu0306 517b63939a
[autoparallel] add unary element wise handler v2 (#1674)
2 years ago
YuliangLiu0306 f6c6a932b8
[autoparallel] add following node generator (#1673)
2 years ago
YuliangLiu0306 52fda88796
[autoparallel] add layer norm handler v2 (#1671)
2 years ago
Fazzie-Maqianli 87c5ad352a
update version to 0.1.10 (#1676)
2 years ago
HELSON b28991dd0a
[feature] A new ZeRO implementation (#1644)
2 years ago
Boyuan Yao b1be5b88bd
[autoparallel] fix insecure subprocess (#1680)
2 years ago
Boyuan Yao d8420f81a4
[hotfix] fix wrong type name in profiler (#1678)
2 years ago
Boyuan Yao 132b4306b7
[fx] Add concrete info prop (#1677)
2 years ago
Boyuan Yao 1df98d5b66
[autoparallel] add rotor C version (#1658)
2 years ago
YuliangLiu0306 11ec070e53
[hotfix]unit test (#1670)
2 years ago
Frank Lee a60024e77a
[autoparallel] added utils for broadcast operation (#1665)
2 years ago
YuliangLiu0306 3f068d1409
[autoparallel] update CommSpec (#1667)
2 years ago
Frank Lee 247a9dbca9
[autoparallel] added bias comm spec to matmul strategy (#1664)
2 years ago
YuliangLiu0306 746f8f979d
[autoparallel] add batch norm handler v2 (#1666)
2 years ago
Kirigaya Kazuto 9708638ded
[pipeline/pytree] add pytree to process args and kwargs | provide `data_process_func` to process args and kwargs after forward (#1642)
2 years ago
YuliangLiu0306 c27e701cb2
[autoparallel] remove no strategy nodes (#1652)
2 years ago
Frank Lee 50f16a2850
[autoparallel] added compute resharding costs for node handler (#1662)
2 years ago
Frank Lee 9ec401a722
[autoparallel] added new strategy constructor template (#1661)
2 years ago
Frank Lee 3a4d6f63a8
[autoparallel] added node handler for bmm (#1655)
2 years ago
YuliangLiu0306 095854477f
[autoparallel] add conv handler v2 (#1663)
2 years ago
YuliangLiu0306 1e7816a460
[autoparallel] adapt solver with gpt (#1653)
2 years ago
Jiarui Fang c638bec028
[embedding] polish async copy (#1657)
2 years ago
Jiarui Fang 988570e4a6
[embedding] add more detail profiling (#1656)
2 years ago
Jiarui Fang e1f97fd2b8
[embedding] print profiling results (#1654)
2 years ago
Frank Lee 30e50c8b4a
[autoparallel] implemented all matmul strategy generator (#1650)
2 years ago
YuliangLiu0306 03978aad45
[autoparallel] change the following nodes strategies generation logic (#1636)
2 years ago
YuliangLiu0306 59f100510a
[autoparallel] where handler (#1651)
2 years ago
Super Daniel 6135e178b3
[fx] refactor code for profiler / enable fake tensor movement. (#1646)
2 years ago
Boyuan Yao 5d0fdb9cb4
[fx] fix offload codegen test (#1648)
2 years ago
Frank Lee 45b39a692a
[autoparallel] implemented linear projection strategy generator (#1639)
2 years ago
Frank Lee 154d3ef432
[fix] fixed the collective pattern name for consistency (#1649)
2 years ago
YuliangLiu0306 b2b2a4af98
[autoparallel] adapt solver with mlp (#1638)
2 years ago
Jiarui Fang 04443605a5
[embedding] non-blocking cpu-gpu copy (#1647)
2 years ago
CsRic 0767f67a0f
[embedding] isolate cache_op from forward (#1645)
2 years ago
Jiarui Fang c5d39215f6
Revert "[feature] new zero implementation (#1623)" (#1643)
2 years ago
HELSON 5be118f405
[feature] new zero implementation (#1623)
2 years ago
Boyuan Yao f921733621
[autoparallel] Add pofo sequence annotation (#1637)
2 years ago
Super Daniel 04bbabeea8
[fx/profiler] provide a table of summary. (#1634)
2 years ago
HELSON 95c35f73bd
[moe] initialize MoE groups by ProcessGroup (#1640)
2 years ago
Jiarui Fang e57df80325
[embeddings] cache option (#1635)
2 years ago
HELSON a088022efc
[moe] fix moe bugs (#1633)
2 years ago
YuliangLiu0306 702dbc5288
[tensor] use communication autograd func (#1617)
2 years ago
YuliangLiu0306 c7ac0f4ab2
[autoparallel] add elementwise handler (#1622)
2 years ago
YuliangLiu0306 3a46215135
[autoparallel] add embedding handler (#1620)
2 years ago
YuliangLiu0306 69448f64c4
[autoparallel] protect bcast handler from invalid strategies (#1631)
2 years ago
YuliangLiu0306 0c703189b9
[autoparallel] add layernorm handler (#1629)
2 years ago
YuliangLiu0306 bf77d3ab65
[autoparallel] recover the merged node strategy index (#1613)
2 years ago
Boyuan Yao d6b01feb66
[fx] Modify offload codegen (#1618)
2 years ago