Commit Graph

534 Commits (11ee8ae478cb2d6e4adcb9668b2abe0d3eba7aca)

Author SHA1 Message Date
Kai Wang (Victor Kai) b38efe4e8a
[NFC] polish test_2p5d/checks_2p5d/check_operation_2p5d.py code style (#1718)
2 years ago
binmakeswell f6389d0813
[NFC] polish tests/test_layers/test_2d/checks_2d/check_operation_2d.py code style (#1715)
2 years ago
HELSON f69f9bf223
[zero] add chunk init function for users (#1729)
2 years ago
Super Daniel 393f594051
[fx/meta/rpc] move _meta_registration.py to fx folder / register fx functions with compatibility checks / remove color debug (#1710)
2 years ago
Frank Lee e8d8eda5e7
[autoparallel] moved tests to test_tensor_shard (#1713)
2 years ago
YuliangLiu0306 845ff4a47a
[autoparallel] resnet block runtime apply (#1709)
2 years ago
Frank Lee 22a115406b
[autoparallel] fixed broken node handler tests (#1708)
2 years ago
HELSON 1468e4bcfc
[zero] add constant placement policy (#1705)
2 years ago
Frank Lee 6c331a5a09
[autoparallel] refactored the autoparallel module for organization (#1706)
2 years ago
Frank Lee 91cd34e6e0
[unittest] added doc for the pytest wrapper (#1704)
2 years ago
YuliangLiu0306 451cd72dea
[autoparallel] adapt runtime passes (#1703)
2 years ago
Jiarui Fang 21962e1593
[embedding] rename FreqAwareEmbedding -> CachedEmbedding (#1699)
2 years ago
Frank Lee 0e52f3d3d5
[unittest] supported conditional testing based on env var (#1701)
2 years ago
Frank Lee 8283e95db3
[autoparallel] collated all deprecated files (#1700)
2 years ago
YuliangLiu0306 81f7530ee7
[autoparallel] adapt solver and CostGraph with new handler (#1695)
2 years ago
YuliangLiu0306 42b882ef06
[autoparallel] add output handler and placeholder handler (#1694)
2 years ago
YuliangLiu0306 56088e6d98
[autoparallel] add pooling handler (#1690)
2 years ago
YuliangLiu0306 319d654f79
[autoparallel] where_handler_v2 (#1688)
2 years ago
Boyuan Yao 31d2f03d27
[autoparallel] fix C version rotor inconsistency (#1691)
2 years ago
Frank Lee 4973157ad7
[autoparallel] added sharding spec conversion for linear handler (#1687)
2 years ago
YuliangLiu0306 af718e83f2
[autoparallel] add reshape handler v2 and fix some previous bug (#1683)
2 years ago
Super Daniel 3dd6994427
[fx/profiler] assigned UUID to each unrecorded tensor/ improved performance on GPT-2 (#1679)
2 years ago
YuliangLiu0306 517b63939a
[autoparallel] add unary element wise handler v2 (#1674)
2 years ago
YuliangLiu0306 f6c6a932b8
[autoparallel] add following node generator (#1673)
2 years ago
YuliangLiu0306 52fda88796
[autoparallel] add layer norm handler v2 (#1671)
2 years ago
HELSON b28991dd0a
[feature] A new ZeRO implementation (#1644)
2 years ago
Boyuan Yao 1df98d5b66
[autoparallel] add rotor C version (#1658)
2 years ago
YuliangLiu0306 11ec070e53
[hotfix]unit test (#1670)
2 years ago
Frank Lee a60024e77a
[autoparallel] added utils for broadcast operation (#1665)
2 years ago
YuliangLiu0306 3f068d1409
[autoparallel] update CommSpec (#1667)
2 years ago
YuliangLiu0306 746f8f979d
[autoparallel] add batch norm handler v2 (#1666)
2 years ago
Kirigaya Kazuto 9708638ded
[pipeline/pytree] add pytree to process args and kwargs | provide `data_process_func` to process args and kwargs after forward (#1642)
2 years ago
Frank Lee 3a4d6f63a8
[autoparallel] added node handler for bmm (#1655)
2 years ago
YuliangLiu0306 095854477f
[autoparallel] add conv handler v2 (#1663)
2 years ago
YuliangLiu0306 1e7816a460
[autoparallel] adapt solver with gpt (#1653)
2 years ago
Frank Lee 30e50c8b4a
[autoparallel] implemented all matmul strategy generator (#1650)
2 years ago
YuliangLiu0306 03978aad45
[autoparallel] change the following nodes strategies generation logic (#1636)
2 years ago
YuliangLiu0306 59f100510a
[autoparallel] where handler (#1651)
2 years ago
Boyuan Yao 5d0fdb9cb4
[fx] fix offload codegen test (#1648)
2 years ago
Frank Lee 45b39a692a
[autoparallel] implemented linear projection strategy generator (#1639)
2 years ago
Frank Lee 154d3ef432
[fix] fixed the collective pattern name for consistency (#1649)
2 years ago
YuliangLiu0306 b2b2a4af98
[autoparallel] adapt solver with mlp (#1638)
2 years ago
Jiarui Fang c5d39215f6
Revert "[feature] new zero implementation (#1623)" (#1643)
2 years ago
HELSON 5be118f405
[feature] new zero implementation (#1623)
2 years ago
HELSON 95c35f73bd
[moe] initialize MoE groups by ProcessGroup (#1640)
2 years ago
HELSON a088022efc
[moe] fix moe bugs (#1633)
2 years ago
YuliangLiu0306 702dbc5288
[tensor] use communication autograd func (#1617)
2 years ago
YuliangLiu0306 0c703189b9
[autoparallel] add layernorm handler (#1629)
2 years ago
YuliangLiu0306 bf77d3ab65
[autoparallel] recover the merged node strategy index (#1613)
2 years ago
Boyuan Yao d6b01feb66
[fx] Modify offload codegen (#1618)
2 years ago
YuliangLiu0306 9eae855408
[hotfix] add recompile after graph manipulation (#1621)
2 years ago
Super Daniel d967779a32
[fx/profiler] tuned the calculation of memory estimation (#1619)
2 years ago
HELSON f7f2248771
[moe] fix MoE bugs (#1628)
2 years ago
Jiarui Fang 38c68b5b9a
[embedding] rollback for better FAW performance (#1625)
2 years ago
Frank Lee d925122020
[autoparallel] added new linear module handler (#1616)
2 years ago
Kirigaya Kazuto 170fa81095
[pipeline/chimera] test chimera | fix bug of initializing (#1615)
2 years ago
Jiarui Fang 504ff1d101
[embeddings] use cache_ratio instead of cuda_row_num (#1611)
2 years ago
YuliangLiu0306 7d1bb71d5d
[fx] PoC of runtime shape consistency application (#1607)
2 years ago
YuliangLiu0306 47b11c432c
[autoparallel]add bcast matmul strategies (#1605)
2 years ago
Boyuan Yao 933b6c6367
[fx] Add pofo solver (#1608)
2 years ago
Kirigaya Kazuto edc9e419ad
[pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera (#1595)
2 years ago
YuliangLiu0306 eac1b79371
[autoparallel] add bcast op handler (#1600)
2 years ago
Boyuan Yao a7cda6f57d
[fx] Add offload codegen (#1598)
2 years ago
Super Daniel c8e9b2ad78
[hotfix/rotor] fix variable names (#1597)
2 years ago
YuliangLiu0306 faa23b9d9a
[autoparallel] add reshape handler (#1594)
2 years ago
Frank Lee 27fe8af60c
[autoparallel] refactored shape consistency to remove redundancy (#1591)
2 years ago
YuliangLiu0306 d164449d00
[autoparallel] add resnet autoparallel unit test and add backward weight communication cost (#1589)
2 years ago
Frank Lee 219f66c571
[autoparallel] added solver option dataclass (#1588)
2 years ago
YuliangLiu0306 82d4376c23
[autoparallel] adapt solver with resnet (#1583)
2 years ago
CsRic f3403ff98e
[embeddings] add already_split_along_rank flag for tablewise mode (#1584)
2 years ago
Boyuan Yao f3687e4ee2
[fx] Add nested checkpoint in activation checkpoint codegen (#1585)
2 years ago
アマデウス e615cfc3a8
[NFC] polish test component gpt code style (#1567)
2 years ago
Kirigaya Kazuto 6159d45417
[pipeline/tuning] improve dispatch performance both time and space cost (#1544)
2 years ago
Super Daniel 4f59693207
[fx] provide a stable but not accurate enough version of profiler. (#1547)
2 years ago
YuliangLiu0306 0908d0fc61
[autoparallel]add backward cost info into strategies (#1524)
2 years ago
YuliangLiu0306 44c866a3e3
[autoparallel] change the merge node logic (#1533)
2 years ago
Jiarui Fang 64169f3e8f
[embedding] polish parallel embedding tablewise (#1545)
2 years ago
CsRic 964123ae0f
[embedding] freq_aware_embedding: add small functions for caller application (#1537)
2 years ago
Boyuan Yao 56159049e8
[fx] Modify solver linearize and add corresponding test (#1531)
2 years ago
Super Daniel 7dc53237c3
[fx] add test for meta tensor. (#1527)
2 years ago
YuliangLiu0306 4b3d6caeb3
[fx]patch nn.functional convolution (#1528)
2 years ago
CsRic 5156d5b4f8
[embedding] add tablewise sharding for FAW (#1526)
2 years ago
Kirigaya Kazuto f1e1836218
[pipeline/pipleline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP (#1508)
2 years ago
Boyuan Yao b231430bcb
[fx] Fix wrong index in annotation and minimal flops in ckpt solver (#1521)
2 years ago
YuliangLiu0306 3345c6d352
[autoparellel]add strategies constructor (#1505)
2 years ago
Frank Lee a0436a62ee
[autoparallel] added liveness analysis (#1516)
2 years ago
Jiarui Fang 9a9ef65313
[FAW] cpu caching operations (#1520)
2 years ago
Jiarui Fang af5438caa2
[FAW] refactor reorder() for CachedParamMgr (#1514)
2 years ago
CsRic 1b8fee8e9c
[FAW] shrink freq_cnter size (#1509)
2 years ago
Boyuan Yao 4acc58ee20
[fx] Fix activation codegen dealing with checkpointing first op (#1510)
2 years ago
Kirigaya Kazuto 5a6fd71f90
[pipeline/rpc] update outstanding mechanism | optimize dispatching strategy (#1497)
2 years ago
CsRic 0ed2f46131
[FAW] FAW embedding use LRU as eviction strategy initialized with dataset stats (#1494)
2 years ago
YuliangLiu0306 8b7d6bd5be
[autoparallel] add more sharding strategies to conv (#1487)
2 years ago
Boyuan Yao de1e716dc4
[fx] Add activation checkpoint solver rotor (#1496)
2 years ago
YuliangLiu0306 413c053453
[autoparallel] add cost graph class (#1481)
2 years ago
YuliangLiu0306 4b03c25f85
[tensor]add 1D device mesh (#1492)
2 years ago
CsRic b8d0e39eaf
[FAW] LFU cache for the FAW
2 years ago
Kirigaya Kazuto 9145aef2b4
[pipeline/rpc] implement distributed optimizer | test with assert_close (#1486)
2 years ago
Frank Lee 3da68d6b1b
[fx] fixed adaptive pooling size concatenation error (#1489)
2 years ago
Jiarui Fang cde7b8a5b8
[FAW] init an LFU implementation for FAW (#1488)
2 years ago