1235 Commits (e8a9bebc8770b9430f4150a400e6fef43cf02d4f)

Author SHA1 Message Date
Kirigaya Kazuto 9708638ded
[pipeline/pytree] add pytree to process args and kwargs | provide `data_process_func` to process args and kwargs after forward (#1642) 2 years ago
YuliangLiu0306 c27e701cb2
[autoparallel] remove no strategy nodes (#1652) 2 years ago
Frank Lee 50f16a2850
[autoparallel] added compute resharding costs for node handler (#1662) 2 years ago
Frank Lee 9ec401a722
[autoparallel] added new strategy constructor template (#1661) 2 years ago
Frank Lee 3a4d6f63a8
[autoparallel] added node handler for bmm (#1655) 2 years ago
YuliangLiu0306 095854477f
[autoparallel] add conv handler v2 (#1663) 2 years ago
YuliangLiu0306 1e7816a460
[autoparallel] adapt solver with gpt (#1653) 2 years ago
Jiarui Fang c638bec028
[embedding] polish async copy (#1657) 2 years ago
Jiarui Fang 988570e4a6
[embedding] add more detail profiling (#1656) 2 years ago
Jiarui Fang e1f97fd2b8
[embedding] print profiling results (#1654) 2 years ago
Frank Lee 30e50c8b4a
[autoparallel] implemented all matmul strategy generator (#1650) 2 years ago
YuliangLiu0306 03978aad45
[autoparallel] change the following nodes strategies generation logic (#1636) 2 years ago
YuliangLiu0306 59f100510a
[autoparallel] where handler (#1651) 2 years ago
Super Daniel 6135e178b3
[fx] refactor code for profiler / enable fake tensor movement. (#1646) 2 years ago
Boyuan Yao 5d0fdb9cb4
[fx] fix offload codegen test (#1648) 2 years ago
Frank Lee 45b39a692a
[autoparallel] implemented linear projection strategy generator (#1639) 2 years ago
Frank Lee 154d3ef432
[fix] fixed the collective pattern name for consistency (#1649) 2 years ago
YuliangLiu0306 b2b2a4af98
[autoparallel] adapt solver with mlp (#1638) 2 years ago
Jiarui Fang 04443605a5
[embedding] non-blocking cpu-gpu copy (#1647) 2 years ago
CsRic 0767f67a0f
[embedding] isolate cache_op from forward (#1645) 2 years ago
Jiarui Fang c5d39215f6
Revert "[feature] new zero implementation (#1623)" (#1643) 2 years ago
HELSON 5be118f405
[feature] new zero implementation (#1623) 2 years ago
Boyuan Yao f921733621
[autoparallel] Add pofo sequence annotation (#1637) 2 years ago
Super Daniel 04bbabeea8
[fx/profiler] provide a table of summary. (#1634) 2 years ago
HELSON 95c35f73bd
[moe] initialize MoE groups by ProcessGroup (#1640) 2 years ago
Jiarui Fang e57df80325
[embeddings] cache option (#1635) 2 years ago
HELSON a088022efc
[moe] fix moe bugs (#1633) 2 years ago
YuliangLiu0306 702dbc5288
[tensor] use communication autograd func (#1617) 2 years ago
YuliangLiu0306 c7ac0f4ab2
[autoparallel] add elementwise handler (#1622) 2 years ago
YuliangLiu0306 3a46215135
[autoparallel] add embedding handler (#1620) 2 years ago
YuliangLiu0306 69448f64c4
[autoparallel] protect bcast handler from invalid strategies (#1631) 2 years ago
YuliangLiu0306 0c703189b9
[autoparallel] add layernorm handler (#1629) 2 years ago
YuliangLiu0306 bf77d3ab65
[autoparallel] recover the merged node strategy index (#1613) 2 years ago
Boyuan Yao d6b01feb66
[fx] Modify offload codegen (#1618) 2 years ago
YuliangLiu0306 9eae855408
[hotfix] add recompile after graph manipulatation (#1621) 2 years ago
Super Daniel d967779a32
[fx/profiler] tuned the calculation of memory estimation (#1619) 2 years ago
HELSON f7f2248771
[moe] fix MoE bugs (#1628) 2 years ago
Jiarui Fang 38c68b5b9a
[embedding] rollback for better FAW performance (#1625) 2 years ago
Frank Lee d925122020
[autoparallel] added new linear module handler (#1616) 2 years ago
Kirigaya Kazuto 170fa81095
[pipeline/chimera] test chimera | fix bug of initializing (#1615) 2 years ago
Jiarui Fang 504ff1d101
[embeddings] use cache_ratio instead of cuda_row_num (#1611) 2 years ago
YuliangLiu0306 6a8f8cc05e
[hotfix] got sliced types (#1614) 2 years ago
Frank Lee d397842fa8
[autoparallel] added new node handler (#1612) 2 years ago
YuliangLiu0306 7d1bb71d5d
[fx] PoC of runtime shape consistency application (#1607) 2 years ago
YuliangLiu0306 47b11c432c
[autoparallel]add bcast matmul strategies (#1605) 2 years ago
Frank Lee edb67cb378
[autoparallel] refactored the data structure for sharding strategy (#1610) 2 years ago
Boyuan Yao 933b6c6367
[fx] Add pofo solver (#1608) 2 years ago
github-actions[bot] d32cf84c46
Automated submodule synchronization (#1609) 2 years ago
Frank Lee 725666d6a9
[workflow] deactivate conda environment before removing (#1606) 2 years ago
Kirigaya Kazuto edc9e419ad
[pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera (#1595) 2 years ago