535 Commits (cf68cc92accd5f0a2538b24e03f1f4f857b69fb9)

Author SHA1 Message Date
Boyuan Yao d6b01feb66
[fx] Modify offload codegen (#1618) 2 years ago
YuliangLiu0306 9eae855408
[hotfix] add recompile after graph manipulation (#1621) 2 years ago
Super Daniel d967779a32
[fx/profiler] tuned the calculation of memory estimation (#1619) 2 years ago
HELSON f7f2248771
[moe] fix MoE bugs (#1628) 2 years ago
Jiarui Fang 38c68b5b9a
[embedding] rollback for better FAW performance (#1625) 2 years ago
Frank Lee d925122020
[autoparallel] added new linear module handler (#1616) 2 years ago
Kirigaya Kazuto 170fa81095
[pipeline/chimera] test chimera | fix bug of initializing (#1615) 2 years ago
Jiarui Fang 504ff1d101
[embeddings] use cache_ratio instead of cuda_row_num (#1611) 2 years ago
YuliangLiu0306 7d1bb71d5d
[fx] PoC of runtime shape consistency application (#1607) 2 years ago
YuliangLiu0306 47b11c432c
[autoparallel] add bcast matmul strategies (#1605) 2 years ago
Boyuan Yao 933b6c6367
[fx] Add pofo solver (#1608) 2 years ago
Kirigaya Kazuto edc9e419ad
[pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera (#1595) 2 years ago
YuliangLiu0306 eac1b79371
[autoparallel] add bcast op handler (#1600) 2 years ago
Boyuan Yao a7cda6f57d
[fx] Add offload codegen (#1598) 2 years ago
Super Daniel c8e9b2ad78
[hotfix/rotor] fix variable names (#1597) 2 years ago
YuliangLiu0306 faa23b9d9a
[autoparallel] add reshape handler (#1594) 2 years ago
Frank Lee 27fe8af60c
[autoparallel] refactored shape consistency to remove redundancy (#1591) 2 years ago
YuliangLiu0306 d164449d00
[autoparallel] add resnet autoparallel unit test and add backward weight communication cost (#1589) 2 years ago
Frank Lee 219f66c571
[autoparallel] added solver option dataclass (#1588) 2 years ago
YuliangLiu0306 82d4376c23
[autoparallel] adapt solver with resnet (#1583) 2 years ago
CsRic f3403ff98e
[embeddings] add already_split_along_rank flag for tablewise mode (#1584) 2 years ago
Boyuan Yao f3687e4ee2
[fx] Add nested checkpoint in activation checkpoint codegen (#1585) 2 years ago
アマデウス e615cfc3a8
[NFC] polish test component gpt code style (#1567) 2 years ago
Kirigaya Kazuto 6159d45417
[pipeline/tuning] improve dispatch performance in both time and space cost (#1544) 2 years ago
Super Daniel 4f59693207
[fx] provide a stable but not accurate enough version of profiler. (#1547) 2 years ago
YuliangLiu0306 0908d0fc61
[autoparallel] add backward cost info into strategies (#1524) 2 years ago
YuliangLiu0306 44c866a3e3
[autoparallel] change the merge node logic (#1533) 2 years ago
Jiarui Fang 64169f3e8f
[embedding] polish parallel embedding tablewise (#1545) 2 years ago
CsRic 964123ae0f
[embedding] freq_aware_embedding: add small functions for caller application (#1537) 2 years ago
Boyuan Yao 56159049e8
[fx] Modify solver linearize and add corresponding test (#1531) 2 years ago
Super Daniel 7dc53237c3
[fx] add test for meta tensor. (#1527) 2 years ago
YuliangLiu0306 4b3d6caeb3
[fx] patch nn.functional convolution (#1528) 2 years ago
CsRic 5156d5b4f8
[embedding] add tablewise sharding for FAW (#1526) 2 years ago
Kirigaya Kazuto f1e1836218
[pipeline/pipleline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP (#1508) 2 years ago
Boyuan Yao b231430bcb
[fx] Fix wrong index in annotation and minimal flops in ckpt solver (#1521) 2 years ago
YuliangLiu0306 3345c6d352
[autoparallel] add strategies constructor (#1505) 2 years ago
Frank Lee a0436a62ee
[autoparallel] added liveness analysis (#1516) 2 years ago
Jiarui Fang 9a9ef65313
[FAW] cpu caching operations (#1520) 2 years ago
Jiarui Fang af5438caa2
[FAW] refactor reorder() for CachedParamMgr (#1514) 2 years ago
CsRic 1b8fee8e9c
[FAW] shrink freq_cnter size (#1509) 2 years ago
Boyuan Yao 4acc58ee20
[fx] Fix activation codegen dealing with checkpointing first op (#1510) 2 years ago
Kirigaya Kazuto 5a6fd71f90
[pipeline/rpc] update outstanding mechanism | optimize dispatching strategy (#1497) 2 years ago
CsRic 0ed2f46131
[FAW] FAW embedding use LRU as eviction strategy initialized with dataset stats (#1494) 2 years ago
YuliangLiu0306 8b7d6bd5be
[autoparallel] add more sharding strategies to conv (#1487) 2 years ago
Boyuan Yao de1e716dc4
[fx] Add activation checkpoint solver rotor (#1496) 2 years ago
YuliangLiu0306 413c053453
[autoparallel] add cost graph class (#1481) 2 years ago
YuliangLiu0306 4b03c25f85
[tensor] add 1D device mesh (#1492) 2 years ago
CsRic b8d0e39eaf
[FAW] LFU cache for the FAW 2 years ago
Kirigaya Kazuto 9145aef2b4
[pipeline/rpc] implement distributed optimizer | test with assert_close (#1486) 2 years ago
Frank Lee 3da68d6b1b
[fx] fixed adaptive pooling size concatenation error (#1489) 2 years ago