Jiarui Fang
cc0ed7cf33
[Gemini] ZeROHookV2 -> GeminiZeROHook ( #1972 )
2 years ago
ver217
f8a7148dec
[kernel] move all symlinks of kernel to `colossalai._C` ( #1971 )
2 years ago
Jiarui Fang
7e24b9b9ee
[Gemini] clean no used MemTraceOp ( #1970 )
2 years ago
Boyuan Yao
7c7921f71b
[autoparallel] add torch.nn.ReLU metainfo ( #1868 )
...
* [fx] metainfo class for auto parallel
* [fx] add unit test for linear metainfo
* [fx] fix bwd param for linear
* [fx] modify unit test
* [fx] modify unit test
* [fx] modify import
* [fx] modify import
* [fx] modify import
* [fx] move meta profiler to auto parallel
* [fx] add conv metainfo class
* [fx] restore profiler
* [fx] restore meta profiler
* [autoparallel] modify unit test
* [fx] modify unit test
* [autoparallel] add batchnorm metainfo class
* [autoparallel] fix batchnorm unit test function declaration
* [fx] restore profiler
* [fx] add relu metainfo class
* [fx] restore profiler
* [autoparallel] modify metainfo input
2 years ago
Jiarui Fang
8c66a1d0aa
[polish] remove useless file _mem_tracer_hook.py ( #1963 )
2 years ago
Jiarui Fang
c4739a725a
[Gemini] polish memstats collector ( #1962 )
2 years ago
YuliangLiu0306
fea3cb661c
[autoparallel] support addmm in tracer and solver ( #1961 )
...
* [fx] patch addmm
* [autoparallel] support addmm in tracer and solver
2 years ago
Jiarui Fang
f7e276fa71
[Gemini] add GeminiAdamOptimizer ( #1960 )
2 years ago
HELSON
7066dfbf82
[zero] fix memory leak for zero2 ( #1955 )
2 years ago
Jiarui Fang
52c6ad26e0
[ColoTensor] reconfig ColoInitContext, decouple default_pg and default_dist_spec. ( #1953 )
2 years ago
zbian
598d456d0e
fixed logger
2 years ago
zbian
6877121377
updated flash attention api
2 years ago
YuliangLiu0306
36c0f3ea5b
[autoparallel] remove redundancy comm node ( #1893 )
2 years ago
アマデウス
e52f9d9109
[tensorparallel] fixed tp layers ( #1938 )
2 years ago
Jiarui Fang
9f4fb3f28a
[ColoTensor] ColoInitContext initialize parameters in shard mode. ( #1937 )
2 years ago
Boyuan Yao
d5c5bc219e
[SC] add GPT example for auto checkpoint ( #1889 )
...
* [sc] SC tutorial for auto checkpoint
* [sc] polish examples
* [sc] polish readme
* [sc] polish readme and help information
* [sc] polish readme and help information
2 years ago
Junming Wu
14a0b18305
[NFC] polish colossalai/amp/naive_amp/__init__.py code style ( #1905 )
2 years ago
HELSON
6e51d296f0
[zero] migrate zero1&2 ( #1878 )
...
* add zero1&2 optimizer
* rename test ditectory
* rename test files
* change tolerance in test
2 years ago
Super Daniel
cc55ff0aa4
[autoparallel] user-friendly API for CheckpointSolver. ( #1879 )
...
Merge for SC tutorial
2 years ago
Super Daniel
448248b27c
[fx] metainfo_trace as an API. ( #1873 )
...
* [fx] metainfo_trace as an API.
* [fx] add return.
2 years ago
Jiarui Fang
986f8cbaa7
[inference] overlap comm and compute in Linear1D_Row when stream_chunk_num > 1 ( #1876 )
2 years ago
YuliangLiu0306
1b494ad73c
[autoparallel] fix linear logical convert issue ( #1857 )
2 years ago
Jiarui Fang
c2947dadf1
[inference] streaming Linear 1D Row inference ( #1874 )
2 years ago
Frank Lee
e6ec99d389
[utils] fixed lazy init context ( #1867 )
2 years ago
zbian
653b0a620e
added skip_bias_add for non-tp linear
2 years ago
LuGY
94329fc139
[NFC] polish colossalai/amp/apex_amp/__init__.py code style ( #1853 )
2 years ago
zbian
1559a09fb7
[NFC] polish amp.naive_amp.grad_scaler code style
2 years ago
HELSON
72c9448920
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/operator_handler.py code style ( #1845 )
2 years ago
Genghan Zhang
b25030cc07
[NFC] polish ./colossalai/amp/torch_amp/__init__.py code style ( #1836 )
2 years ago
Sze-qq
95ac4f88ea
[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/conv_handler.py code style ( #1829 )
...
Co-authored-by: siqi <siqi@siqis-MacBook-Pro.local>
2 years ago
Ziyue Jiang
5da03c936d
[NFC] polish colossalai/amp/torch_amp/_grad_scaler.py code style ( #1823 )
...
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
Fazzie-Maqianli
399f84d8f6
[NFC] polish colossalai/amp/naive_amp/_fp16_optimizer.py code style ( #1819 )
2 years ago
CsRic
9623ec1b02
[NFC] polish colossalai/amp/naive_amp/_utils.py code style ( #1816 )
...
* [NFC] polish colossalai/nn/metric/accuracy_2p5d.py code style (#1714 )
* [NFC] polish colossalai/zero/sharded_param/__init__.py code style
* [NFC] polish colossalai/amp/naive_amp/_utils.py code style
Co-authored-by: shenggan <csg19971016@gmail.com>
Co-authored-by: ric <mkkt_bkkt@mail.ustc.edu.cn>
2 years ago
binmakeswell
3c3714fc2a
[NFC] polish strategies_constructor.py code style ( #1806 )
2 years ago
Jiarui Fang
3ce4463fe6
[utils] remove lazy_memory_allocate from ColoInitContext ( #1844 )
2 years ago
Jiarui Fang
fba34efb5a
version to 0.1.11rc2 ( #1832 )
2 years ago
YuliangLiu0306
49216d7ab1
[autoparallel] fix bugs caused by negative dim key ( #1808 )
...
* [autoparallel] fix bugs caused by negative dim key
* fix import error
* fix matmul test issue
* fix unit test issue
2 years ago
アマデウス
4268ae017b
[kernel] added jit warmup ( #1792 )
2 years ago
YuliangLiu0306
f6032ddb17
[autoparallel] fix bias addition module ( #1800 )
2 years ago
Jiarui Fang
cd5a0d56fa
[Gemini] make gemini usage simple ( #1821 )
2 years ago
ver217
99870726b1
[CheckpointIO] a uniform checkpoint I/O module ( #1689 )
2 years ago
Boyuan Yao
629172b319
[autoparallel] add batch norm metainfo ( #1815 )
...
* [fx] metainfo class for auto parallel
* [fx] add unit test for linear metainfo
* [fx] fix bwd param for linear
* [fx] modify unit test
* [fx] modify unit test
* [fx] modify import
* [fx] modify import
* [fx] modify import
* [fx] move meta profiler to auto parallel
* [fx] add conv metainfo class
* [fx] restore profiler
* [fx] restore meta profiler
* [autoparallel] modify unit test
* [fx] modify unit test
* [autoparallel] add batchnorm metainfo class
* [autoparallel] fix batchnorm unit test function declaration
* [fx] restore profiler
2 years ago
Super Daniel
441d584e4a
[fx] add a symbolic_trace api. ( #1812 )
...
* [fx] add a symbolic_trace api.
* [fx] fix import errors.
2 years ago
xcnick
e0da01ea71
[hotfix] fix build error when torch version >= 1.13 ( #1803 )
2 years ago
oahzxl
9639ea88fc
[kernel] more flexible flashatt interface ( #1804 )
2 years ago
Zihao
20e255d4e8
MemStatsCollectorStatic ( #1765 )
2 years ago
Boyuan Yao
327d07c44a
[autoparallel] add conv metainfo class for auto parallel ( #1796 )
...
* [fx] metainfo class for auto parallel
* [fx] add unit test for linear metainfo
* [fx] fix bwd param for linear
* [fx] modify unit test
* [fx] modify unit test
* [fx] modify import
* [fx] modify import
* [fx] modify import
* [fx] move meta profiler to auto parallel
* [fx] add conv metainfo class
* [fx] restore profiler
* [fx] restore meta profiler
* [autoparallel] modify unit test
* [fx] modify unit test
2 years ago
oahzxl
501a9e9cd2
[hotfix] polish flash attention ( #1802 )
2 years ago
Jiarui Fang
218c75fd9d
[NFC] polish type hint for shape consistency ( #1801 )
...
* [NFC] polish type hint for shape consistency
* polish code
* polish code
2 years ago
Jiarui Fang
c248800359
[kernel] skip tests of flash_attn and triton when they are not available ( #1798 )
2 years ago
YuliangLiu0306
e34e850a4c
[autoparallel]add essential CommActions for broadcast oprands ( #1793 )
2 years ago
Boyuan Yao
05ce3d369f
[fx] Add linear metainfo class for auto parallel ( #1783 )
...
* [fx] metainfo class for auto parallel
* [fx] add unit test for linear metainfo
* [fx] fix bwd param for linear
* [fx] modify unit test
* [fx] modify unit test
* [fx] modify import
* [fx] modify import
* [fx] modify import
* [fx] move meta profiler to auto parallel
2 years ago
Super Daniel
e8a9bebc87
[autoparallel] refactor and add rotorc. ( #1789 )
...
* [autoparallel] refactor and add rotorc.
* [autoparallel] refactor and add rotorc.
2 years ago
YuliangLiu0306
2c4c7b3618
[autoparallel] add getattr handler ( #1767 )
...
* [autoparallel] add getattr haandler
* polish code
* add extra processes for Parameters
* add unit test for param resharding cost
* add docstring and polish test
2 years ago
HELSON
c6a1a62636
[hotfix] fix zero's incompatibility with checkpoint in torch-1.12 ( #1786 )
...
* [hotfix] fix zero's incompatibility with checkpoint in torch-1.12
* [zero] add cpu shard init
* [zero] add tiny example test
* [colo_tensor] fix bugs for torch-1.11
2 years ago
kurisusnowdeng
0b8161fab8
updated tp layers
2 years ago
Jiarui Fang
cb5a587e9a
[hotfix] polish chunk import ( #1787 )
2 years ago
YuliangLiu0306
e859380bf7
[fx] support module with bias addition ( #1780 )
...
* [autoparallel] refactor tracer to fix bias addition issue
* [fx] support module with bias addition
* create bias_addition_module
* refactor file structure
* polish code
* fix unit test
2 years ago
Frank Lee
f3f19a5c47
[autoparallel] added matmul handler ( #1763 )
...
* [autoparallel] added matmul handler
* polish code
2 years ago
Ziyue Jiang
4df0194976
[Pipeline]Adapt to Pipelinable OPT ( #1782 )
2 years ago
YuliangLiu0306
27de252334
[autoparallel] fix conv handler numerical test ( #1771 )
2 years ago
Super Daniel
1e88811c7a
[autoparallel] move ckpt solvers to autoparallel folder / refactor code ( #1764 )
...
* [autoparallel] first move.
* [autoparallel] add solver rotor.
* [autoparallel] add ckpt solvers.
* [autoparallel] modify codegen.
* [fx] fix annotation in test.
* [fx] remove check.
* [autoparallel] polish docstring.
* [fx] refactor MetaTensor.
2 years ago
Jiarui Fang
f34dab4270
[compatibility] ChunkMgr import error ( #1772 )
2 years ago
YuliangLiu0306
b0f7c8bde8
[autoparallel] update CommSpec to CommActions ( #1768 )
...
* [autoparallel] update CommSpec to CommActions
* polish code
2 years ago
YuliangLiu0306
b4cc59b61e
[autoparallel] add numerical test for node strategies ( #1760 )
...
* [autoparallel] add numerical test for node strategies
* polish code
* polish code
2 years ago
oahzxl
25952b67d7
[feat] add flash attention ( #1762 )
2 years ago
Super Daniel
0584654c79
[fx] refactor memory utils and extend shard utils. ( #1754 )
...
* [fx] change memory.py to memory_utils.py.
* [fx] add shard utils.
* [fx] fix import.
* [fx] check code style.
* [fx] add comment.
* [autoparallel] first move.
* [fx] add time computations.
2 years ago
Ziyue Jiang
63f250bbd4
fix file name ( #1759 )
...
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
2 years ago
YuliangLiu0306
314d8c497f
[autoparallel] refactor the runtime apply pass and add docstring to passes ( #1757 )
...
* [autoparallel] refactor the runtime apply pass and add doc string to passes
* fix unit test
* polish
2 years ago
Frank Lee
f9a613d660
[autoparallel] added binary elementwise node handler ( #1758 )
...
* [autoparallel] added binary elementwise node handler
* polish code
2 years ago
YuliangLiu0306
d2fc067231
[autoparallel] fix param hook issue in transform pass ( #1755 )
2 years ago
Frank Lee
262652c8bc
[autoparallel] added addbmm handler ( #1751 )
2 years ago
YuliangLiu0306
980ed21723
[autoparallel] shard param and buffer as expected ( #1753 )
...
* [autoparallel] shard param and buffer as expected
* fix unit test issue
2 years ago
YuliangLiu0306
cdb7d5e7d2
[hotfix] autoparallel unit test ( #1752 )
2 years ago
YuliangLiu0306
a4ce180e85
[autoparallel] add sequential order to communication actions ( #1735 )
2 years ago
Frank Lee
474111ecb5
[autoparallel] fixed wrong sharding strategy in conv handler ( #1747 )
...
* [autoparallel] fixed wrong sharding strategy in conv handler
* polish code
2 years ago
Frank Lee
8b8937d901
[autoparallel] fixed wrong generated strategy for dot op ( #1746 )
...
* [autoparallel] fixed wrong generated strategy for dot op
* polish code
2 years ago
Frank Lee
993b8875b6
[autoparallel] handled illegal sharding strategy in shape consistency ( #1744 )
...
* [autoparallel] handled illegal sharding strategy in shape consistency
* polish code
2 years ago
Frank Lee
88a79814fb
[autoparallel] handled illegal strategy in node handler ( #1743 )
...
* [autoparallel] handled illegal strategy in node handler
* polish code
2 years ago
Super Daniel
30874f1692
[fx/profiler] debug the fx.profiler / add an example test script for fx.profiler ( #1730 )
...
* [fx/profiler] add test.
* [fx] fix file names.
* [fx] add docstring and comment.
* [fx] polish profiler.py.
* [fx] fix import errors.
* [fx] fix profiler.
* [fx] fix names.
2 years ago
Frank Lee
eee84908d4
[autoparallel] handled illegal sharding strategy ( #1728 )
...
* [autoparallel] handled illegal sharding strategy
* polish code
2 years ago
Sze-qq
23703c9dd6
[NFC] polish colossalai/nn/metric/_utils.py code style ( #1727 )
2 years ago
Ofey Chan
7e62af28a0
[NFC] polish accuracy_2d.py code style ( #1719 )
2 years ago
LuGY
730f88f8e1
[NFC] polish _checkpoint_hook.py code style ( #1722 )
2 years ago
CsRic
ea961d8fd1
[NFC] polish colossalai/zero/sharded_param/__init__.py code style ( #1717 )
...
Co-authored-by: ric <mkkt_bkkt@mail.ustc.edu.cn>
2 years ago
yuxuan-lou
2b49ca80a3
[NFC] polish colossalai/nn/lr_scheduler/linear.py code style ( #1716 )
2 years ago
shenggan
e1d780030d
[NFC] polish colossalai/nn/metric/accuracy_2p5d.py code style ( #1714 )
2 years ago
YuliangLiu0306
d373e67b99
[hotfix] resharding cost issue ( #1742 )
2 years ago
Jiarui Fang
24e84eba60
upgrade version to 0.1.11rc1 ( #1739 )
2 years ago
Frank Lee
d2e0e39c9d
[release] update to v0.1.11 ( #1736 )
2 years ago
HELSON
f69f9bf223
[zero] add chunk init function for users ( #1729 )
...
* add chunk manager init function
* fix unit tests
* add comment
* add flush=True
2 years ago
YuliangLiu0306
51b89d2202
[autoparallel] runtime_backward_apply ( #1720 )
2 years ago
Super Daniel
393f594051
[fx/meta/rpc] move _meta_registration.py to fx folder / register fx functions with compatibility checks / remove color debug ( #1710 )
...
* [fx] move meta registration
* [fx] fix tests.
* [fx] fix test.
* [fx] fix.
* [meta] refactor meta registration.py.
* [fx] add compatibility descriptions.
* [fx] polish import.
* [fx] add a decorator.
* [fx] fix tests.
* [fx] remove print.
* [fx] edit raise error.
* [fx] edit raise error.
* [fx] add type hint.
* [fx] fix import in experimental.
* [rpc] remove color debug.
* [meta] fix naming.
2 years ago
YuliangLiu0306
845ff4a47a
[autoparallel] resnet block runtime apply ( #1709 )
...
* [autoparallel] resnet block runtime apply
* seperate buffer and parameter in MemoryCost
* polish code
* add comments and todos
* fix test issue
2 years ago
Frank Lee
22a115406b
[autoparallel] fixed broken node handler tests ( #1708 )
2 years ago
HELSON
1468e4bcfc
[zero] add constant placement policy ( #1705 )
...
* fixes memory leak when paramter is in fp16 in ZeroDDP init.
* bans chunk releasement in CUDA. Only when a chunk is about to offload, it is allowed to release.
* adds a constant placement policy. With it, users can allocate a reserved caching memory space for parameters.
2 years ago
binmakeswell
5f41463a76
add optimizer README for tutorials ( #1707 )
2 years ago
Frank Lee
6c331a5a09
[autoparallel] refactored the autoparallel module for organization ( #1706 )
...
* [autoparallel] refactored the autoparallel module for organization
* polish code
2 years ago
Frank Lee
91cd34e6e0
[unittest] added doc for the pytest wrapper ( #1704 )
2 years ago
YuliangLiu0306
451cd72dea
[autoparallel] adapt runtime passes ( #1703 )
...
* [autoparallel] adapt runtime passes v2
* polish code
2 years ago
Jiarui Fang
21962e1593
[embedding] rename FreqAwareEmbedding -> CachedEmbedding ( #1699 )
2 years ago
Frank Lee
0e52f3d3d5
[unittest] supported condititonal testing based on env var ( #1701 )
...
polish code
2 years ago
Frank Lee
8283e95db3
[autoparallel] collated all deprecated files ( #1700 )
...
* [autoparallel] collated all deprecated files
* polish code
2 years ago
Frank Lee
e2355d01b9
[autoparallel] init new folder structure ( #1696 )
2 years ago
YuliangLiu0306
81f7530ee7
[autoparallel] adapt solver and CostGraph with new handler ( #1695 )
...
* [autoparallel] adapt solver and CostGraph with new handler
* fix test issue
2 years ago
YuliangLiu0306
42b882ef06
[autoparallel] add output handler and placeholder handler ( #1694 )
...
* [autoparallel] add output handler and placeholder handler
* Delete test_solver_with_resnet.py
* fix test bugs
2 years ago
YuliangLiu0306
56088e6d98
[autoparallel] add pooling handler ( #1690 )
...
* [autoparallel] add pooling handler
* polish code
2 years ago
YuliangLiu0306
319d654f79
[autoparallel] where_handler_v2 ( #1688 )
...
* where generator
* [autoparallel] where_handler_v2
2 years ago
Boyuan Yao
31d2f03d27
[autoparallel] fix C version rotor inconsistency ( #1691 )
2 years ago
Jiarui Fang
363fc2861a
[embeddings] more detailed timer ( #1692 )
2 years ago
Frank Lee
4973157ad7
[autoparallel] added sharding spec conversion for linear handler ( #1687 )
2 years ago
YuliangLiu0306
af718e83f2
[autoparallel] add reshape handler v2 and fix some previous bug ( #1683 )
2 years ago
YuliangLiu0306
6878e42248
[hotfix] solver bug caused by dict type comm cost ( #1686 )
2 years ago
Super Daniel
3dd6994427
[fx/profiler] assigned UUID to each unrecorded tensor/ improved performance on GPT-2 ( #1679 )
...
* [fx/profiler] modify data_ptr into uuid for all tensors.
* [fx] modify uuid.
* [fx/profiler] tune performance on GPT-2.
* [fx] updates.
* [fx] debug.
* [fx] debug.
* [fx] cuda.
2 years ago
Kirigaya Kazuto
0df5034a36
[pipeline/fix-bug] num_microbatches support any integrate | stable chimera | launch tool for rpc pp framework ( #1684 )
...
* [pipeline/tuning] improve dispatch performance both time and space cost
* [pipeline/converge] add interface for testing convergence
* [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style
* Update PipelineBase.py
* [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera
* [pipeline/chimera] test chimera | fix bug of initializing
* [pipeline/pytree] add pytree to process args and kwargs | provide to process args and kwargs after forward
* [pipeline/fix-bug] num_microbatches support any integrate | stable chimera | launch tool for rpc pp framework
2 years ago
jim
e5ab6be72e
[hotfix[ fix colotensor.type() raise NotImplementedError ( #1682 )
2 years ago
Kirigaya Kazuto
3b2a59b0ba
[pipeline/rank_recorder] fix bug when process data before backward | add a tool for multiple ranks debug ( #1681 )
...
* [pipeline/tuning] improve dispatch performance both time and space cost
* [pipeline/converge] add interface for testing convergence
* [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style
* Update PipelineBase.py
* [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera
* [pipeline/chimera] test chimera | fix bug of initializing
* [pipeline/pytree] add pytree to process args and kwargs | provide to process args and kwargs after forward
2 years ago
YuliangLiu0306
517b63939a
[autoparallel] add unary element wise handler v2 ( #1674 )
2 years ago
YuliangLiu0306
f6c6a932b8
[autoparallel] add following node generator ( #1673 )
...
* [autoparallel] add following node generator
* polish code
* polish code
* update name of arguments
2 years ago
YuliangLiu0306
52fda88796
[autoparallel] add layer norm handler v2 ( #1671 )
...
* [autoparallel] add layer norm handler v2
* polish code
* polish code
2 years ago
Fazzie-Maqianli
87c5ad352a
update version to 0.1.10 ( #1676 )
2 years ago
HELSON
b28991dd0a
[feature] A new ZeRO implementation ( #1644 )
2 years ago
Boyuan Yao
b1be5b88bd
[autoparallel] fix insecure subprocess ( #1680 )
...
* [autoparallel] fix insecure subprocess
* [fx] fix insecure subprocess
2 years ago
Boyuan Yao
d8420f81a4
[hotfix] fix wrong type name in profiler ( #1678 )
2 years ago
Boyuan Yao
132b4306b7
[fx] Add concrete info prop ( #1677 )
...
* [fx] concreteinfoprop
* [fx] add concreteinfoprop
* [fx] modify docstring of ConcreteInfoProp
* [fx] fix device error
* [fx] modify parameter calculation
* [fx] modify parameters calculation
2 years ago
Boyuan Yao
1df98d5b66
[autoparallel] add rotor C version ( #1658 )
...
* [autoparallel] add rotor c version
* [fx] remove metainfoprop in rotor solver
* [autoparallel] modify C
code format
* [autoparallel] remove build.py
* [autoparallel] fix C extension build
* [autoparallel] add C solver consistency test
* [autoparallel] remove some unused imports
* [autoparallel] refactor rotor solver code
* [autoparallel] replace print with colossalai logger
* [autoparallel] ranks fixed
2 years ago
YuliangLiu0306
11ec070e53
[hotfix]unit test ( #1670 )
2 years ago
Frank Lee
a60024e77a
[autoparallel] added utils for broadcast operation ( #1665 )
...
* [autoparallel] added utils for broadcast operation
* polish code
2 years ago
YuliangLiu0306
3f068d1409
[autoparallel] update CommSpec ( #1667 )
2 years ago
Frank Lee
247a9dbca9
[autoparallel] added bias comm spec to matmul strategy ( #1664 )
2 years ago
YuliangLiu0306
746f8f979d
[autoparallel] add batch norm handler v2 ( #1666 )
2 years ago
Kirigaya Kazuto
9708638ded
[pipeline/pytree] add pytree to process args and kwargs | provide `data_process_func` to process args and kwargs after forward ( #1642 )
...
* [pipeline/tuning] improve dispatch performance both time and space cost
* [pipeline/converge] add interface for testing convergence
* [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style
* Update PipelineBase.py
* [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera
* [pipeline/chimera] test chimera | fix bug of initializing
* [pipeline/pytree] add pytree to process args and kwargs | provide to process args and kwargs after forward
2 years ago
YuliangLiu0306
c27e701cb2
[autoparallel] remove no strategy nodes ( #1652 )
...
* [autoparallel] remove no strategy nodes
* fix none object iteration issue
2 years ago
Frank Lee
50f16a2850
[autoparallel] added compute resharding costs for node handler ( #1662 )
2 years ago
Frank Lee
9ec401a722
[autoparallel] added new strategy constructor template ( #1661 )
...
* [autoparallel] added new strategy constructor template
* polish code
2 years ago
Frank Lee
3a4d6f63a8
[autoparallel] added node handler for bmm ( #1655 )
2 years ago
YuliangLiu0306
095854477f
[autoparallel] add conv handler v2 ( #1663 )
2 years ago
YuliangLiu0306
1e7816a460
[autoparallel] adapt solver with gpt ( #1653 )
2 years ago
Jiarui Fang
c638bec028
[embedding] polish async copy ( #1657 )
2 years ago
Jiarui Fang
988570e4a6
[embedding] add more detail profiling ( #1656 )
2 years ago
Jiarui Fang
e1f97fd2b8
[embedding] print profiling results ( #1654 )
2 years ago
Frank Lee
30e50c8b4a
[autoparallel] implemented all matmul strategy generator ( #1650 )
2 years ago
YuliangLiu0306
03978aad45
[autoparallel] change the following nodes strategies generation logic ( #1636 )
...
* [autoparallel] change the following nodes strategies generation logic
* fix unit test
2 years ago
YuliangLiu0306
59f100510a
[autoparallel] where handler ( #1651 )
...
* [autoparallel] where handler
* fix unit test
2 years ago
Super Daniel
6135e178b3
[fx] refactor code for profiler / enable fake tensor movement. ( #1646 )
...
* [fx/profiling] provide summary for MetaInfoProp.
* [fx/profiler] provide a table of summary.
* [fx/profiler] provide a table of summary.
* [fx/profiler] provide a table of summary.
* [fx/profiler] provide a table of summary.
* [fx] optimize table repr.
* [fx] optimize table repr.
* [fx] refactor code for profiler.
* [fx] add docstring.
* [fx] add docstring.
* [fx] skip test.
* [fx] redo.
* [fx] redo.
* [fx] fix import error for torch11.
* [fx] fix import error for torch11.
2 years ago
Boyuan Yao
5d0fdb9cb4
[fx] fix offload codegen test ( #1648 )
...
* [fx] fix offload codegen test
* [fx] modify typing
2 years ago
Frank Lee
45b39a692a
[autoparallel] implemented linear projection strategy generator ( #1639 )
2 years ago
Frank Lee
154d3ef432
[fix] fixed the collective pattern name for consistency ( #1649 )
...
* [fix] fixed the collective pattern name for consistency
* polish code
2 years ago
YuliangLiu0306
b2b2a4af98
[autoparallel] adapt solver with mlp ( #1638 )
2 years ago
Jiarui Fang
04443605a5
[embedding] non-blocking cpu-gpu copy ( #1647 )
2 years ago