1177 Commits (51b89d2202cebf7b81aeae220aa2699d2e012543)

Author SHA1 Message Date
Jiarui Fang 988570e4a6
[embedding] add more detail profiling (#1656)
2 years ago
Jiarui Fang e1f97fd2b8
[embedding] print profiling results (#1654)
2 years ago
Frank Lee 30e50c8b4a
[autoparallel] implemented all matmul strategy generator (#1650)
2 years ago
YuliangLiu0306 03978aad45
[autoparallel] change the following nodes strategies generation logic (#1636)
2 years ago
YuliangLiu0306 59f100510a
[autoparallel] where handler (#1651)
2 years ago
Super Daniel 6135e178b3
[fx] refactor code for profiler / enable fake tensor movement. (#1646)
2 years ago
Boyuan Yao 5d0fdb9cb4
[fx] fix offload codegen test (#1648)
2 years ago
Frank Lee 45b39a692a
[autoparallel] implemented linear projection strategy generator (#1639)
2 years ago
Frank Lee 154d3ef432
[fix] fixed the collective pattern name for consistency (#1649)
2 years ago
YuliangLiu0306 b2b2a4af98
[autoparallel] adapt solver with mlp (#1638)
2 years ago
Jiarui Fang 04443605a5
[embedding] non-blocking cpu-gpu copy (#1647)
2 years ago
CsRic 0767f67a0f
[embedding] isolate cache_op from forward (#1645)
2 years ago
Jiarui Fang c5d39215f6
Revert "[feature] new zero implementation (#1623)" (#1643)
2 years ago
HELSON 5be118f405
[feature] new zero implementation (#1623)
2 years ago
Boyuan Yao f921733621
[autoparallel] Add pofo sequence annotation (#1637)
2 years ago
Super Daniel 04bbabeea8
[fx/profiler] provide a table of summary. (#1634)
2 years ago
HELSON 95c35f73bd
[moe] initialize MoE groups by ProcessGroup (#1640)
2 years ago
Jiarui Fang e57df80325
[embeddings] cache option (#1635)
2 years ago
HELSON a088022efc
[moe] fix moe bugs (#1633)
2 years ago
YuliangLiu0306 702dbc5288
[tensor] use communication autograd func (#1617)
2 years ago
YuliangLiu0306 c7ac0f4ab2
[autoparallel] add elementwise handler (#1622)
2 years ago
YuliangLiu0306 3a46215135
[autoparallel] add embedding handler (#1620)
2 years ago
YuliangLiu0306 69448f64c4
[autoparallel] protect bcast handler from invalid strategies (#1631)
2 years ago
YuliangLiu0306 0c703189b9
[autoparallel] add layernorm handler (#1629)
2 years ago
YuliangLiu0306 bf77d3ab65
[autoparallel] recover the merged node strategy index (#1613)
2 years ago
Boyuan Yao d6b01feb66
[fx] Modify offload codegen (#1618)
2 years ago
YuliangLiu0306 9eae855408
[hotfix] add recompile after graph manipulatation (#1621)
2 years ago
Super Daniel d967779a32
[fx/profiler] tuned the calculation of memory estimation (#1619)
2 years ago
HELSON f7f2248771
[moe] fix MoE bugs (#1628)
2 years ago
Jiarui Fang 38c68b5b9a
[embedding] rollback for better FAW performance (#1625)
2 years ago
Frank Lee d925122020
[autoparallel] added new linear module handler (#1616)
2 years ago
Kirigaya Kazuto 170fa81095
[pipeline/chimera] test chimera | fix bug of initializing (#1615)
2 years ago
Jiarui Fang 504ff1d101
[embeddings] use cache_ratio instead of cuda_row_num (#1611)
2 years ago
YuliangLiu0306 6a8f8cc05e
[hotfix] got sliced types (#1614)
2 years ago
Frank Lee d397842fa8
[autoparallel] added new node handler (#1612)
2 years ago
YuliangLiu0306 7d1bb71d5d
[fx] PoC of runtime shape consistency application (#1607)
2 years ago
YuliangLiu0306 47b11c432c
[autoparallel]add bcast matmul strategies (#1605)
2 years ago
Frank Lee edb67cb378
[autoparallel] refactored the data structure for sharding strategy (#1610)
2 years ago
Boyuan Yao 933b6c6367
[fx] Add pofo solver (#1608)
2 years ago
github-actions[bot] d32cf84c46
Automated submodule synchronization (#1609)
2 years ago
Frank Lee 725666d6a9
[workflow] deactivate conda environment before removing (#1606)
2 years ago
Kirigaya Kazuto edc9e419ad
[pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera (#1595)
2 years ago
ver217 c9e8ce67b8
fix move fp32 shards (#1604)
2 years ago
YuliangLiu0306 eac1b79371
[autoparallel] add bcast op handler (#1600)
2 years ago
Frank Lee 3abf98a633
[autoparallel] added all non-bcast matmul strategies (#1603)
2 years ago
Frank Lee db98b695b2
[autoparallel] added strategy generator and bmm strategies (#1602)
2 years ago
Jiarui Fang a19eb80998
[embedding] updates some default parameters
2 years ago
Super Daniel cd5cf2bcc9
[fx/tuning] tune performance on rotor with meta info. (#1599)
2 years ago
Boyuan Yao a7cda6f57d
[fx] Add offload codegen (#1598)
2 years ago
Super Daniel c8e9b2ad78
[hotfix/rotor] fix variable names (#1597)
2 years ago