ColossalAI

Commit Graph

Author	SHA1	Message	Date
YuliangLiu0306	2c4c7b3618	[autoparallel] add getattr handler (#1767 ) * [autoparallel] add getattr handler * polish code * add extra processes for Parameters * add unit test for param resharding cost * add docstring and polish test	2022-11-03 12:31:33 +08:00
HELSON	c6a1a62636	[hotfix] fix zero's incompatibility with checkpoint in torch-1.12 (#1786 ) * [hotfix] fix zero's incompatibility with checkpoint in torch-1.12 * [zero] add cpu shard init * [zero] add tiny example test * [colo_tensor] fix bugs for torch-1.11	2022-11-02 16:11:34 +08:00
kurisusnowdeng	0b8161fab8	updated tp layers	2022-11-02 12:19:38 +08:00
Jiarui Fang	cb5a587e9a	[hotfix] polish chunk import (#1787 )	2022-11-02 12:10:52 +08:00
YuliangLiu0306	e859380bf7	[fx] support module with bias addition (#1780 ) * [autoparallel] refactor tracer to fix bias addition issue * [fx] support module with bias addition * create bias_addition_module * refactor file structure * polish code * fix unit test	2022-11-01 22:53:51 +08:00
Frank Lee	f3f19a5c47	[autoparallel] added matmul handler (#1763 ) * [autoparallel] added matmul handler * polish code	2022-11-01 15:14:53 +08:00
Ziyue Jiang	4df0194976	[Pipeline]Adapt to Pipelinable OPT (#1782 )	2022-11-01 14:18:50 +08:00
YuliangLiu0306	27de252334	[autoparallel] fix conv handler numerical test (#1771 )	2022-11-01 10:43:44 +08:00
Super Daniel	1e88811c7a	[autoparallel] move ckpt solvers to autoparallel folder / refactor code (#1764 ) * [autoparallel] first move. * [autoparallel] add solver rotor. * [autoparallel] add ckpt solvers. * [autoparallel] modify codegen. * [fx] fix annotation in test. * [fx] remove check. * [autoparallel] polish docstring. * [fx] refactor MetaTensor.	2022-11-01 10:43:15 +08:00
Jiarui Fang	f34dab4270	[compatibility] ChunkMgr import error (#1772 )	2022-10-28 14:48:54 +08:00
YuliangLiu0306	b0f7c8bde8	[autoparallel] update CommSpec to CommActions (#1768 ) * [autoparallel] update CommSpec to CommActions * polish code	2022-10-28 09:57:43 +08:00
YuliangLiu0306	b4cc59b61e	[autoparallel] add numerical test for node strategies (#1760 ) * [autoparallel] add numerical test for node strategies * polish code * polish code	2022-10-27 10:42:54 +08:00
oahzxl	25952b67d7	[feat] add flash attention (#1762 )	2022-10-26 16:15:52 +08:00
Super Daniel	0584654c79	[fx] refactor memory utils and extend shard utils. (#1754 ) * [fx] change memory.py to memory_utils.py. * [fx] add shard utils. * [fx] fix import. * [fx] check code style. * [fx] add comment. * [autoparallel] first move. * [fx] add time computations.	2022-10-26 14:24:41 +08:00
Ziyue Jiang	63f250bbd4	fix file name (#1759 ) Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>	2022-10-25 16:48:48 +08:00
YuliangLiu0306	314d8c497f	[autoparallel] refactor the runtime apply pass and add docstring to passes (#1757 ) * [autoparallel] refactor the runtime apply pass and add doc string to passes * fix unit test * polish	2022-10-25 14:32:22 +08:00
Frank Lee	f9a613d660	[autoparallel] added binary elementwise node handler (#1758 ) * [autoparallel] added binary elementwise node handler * polish code	2022-10-25 14:32:01 +08:00
YuliangLiu0306	d2fc067231	[autoparallel] fix param hook issue in transform pass (#1755 )	2022-10-24 13:13:38 +08:00
Frank Lee	262652c8bc	[autoparallel] added addbmm handler (#1751 )	2022-10-21 18:55:48 +08:00
YuliangLiu0306	980ed21723	[autoparallel] shard param and buffer as expected (#1753 ) * [autoparallel] shard param and buffer as expected * fix unit test issue	2022-10-21 15:45:13 +08:00
YuliangLiu0306	cdb7d5e7d2	[hotfix] autoparallel unit test (#1752 )	2022-10-20 19:51:38 +08:00
YuliangLiu0306	a4ce180e85	[autoparallel] add sequential order to communication actions (#1735 )	2022-10-20 18:48:18 +08:00
Frank Lee	474111ecb5	[autoparallel] fixed wrong sharding strategy in conv handler (#1747 ) * [autoparallel] fixed wrong sharding strategy in conv handler * polish code	2022-10-20 16:12:39 +08:00
Frank Lee	8b8937d901	[autoparallel] fixed wrong generated strategy for dot op (#1746 ) * [autoparallel] fixed wrong generated strategy for dot op * polish code	2022-10-20 15:18:16 +08:00
Frank Lee	993b8875b6	[autoparallel] handled illegal sharding strategy in shape consistency (#1744 ) * [autoparallel] handled illegal sharding strategy in shape consistency * polish code	2022-10-20 12:06:25 +08:00
Frank Lee	88a79814fb	[autoparallel] handled illegal strategy in node handler (#1743 ) * [autoparallel] handled illegal strategy in node handler * polish code	2022-10-19 17:08:52 +08:00
Super Daniel	30874f1692	[fx/profiler] debug the fx.profiler / add an example test script for fx.profiler (#1730 ) * [fx/profiler] add test. * [fx] fix file names. * [fx] add docstring and comment. * [fx] polish profiler.py. * [fx] fix import errors. * [fx] fix profiler. * [fx] fix names.	2022-10-19 14:24:51 +08:00
Frank Lee	eee84908d4	[autoparallel] handled illegal sharding strategy (#1728 ) * [autoparallel] handled illegal sharding strategy * polish code	2022-10-19 12:53:06 +08:00
Sze-qq	23703c9dd6	[NFC] polish colossalai/nn/metric/_utils.py code style (#1727 )	2022-10-19 12:20:51 +08:00
Ofey Chan	7e62af28a0	[NFC] polish accuracy_2d.py code style (#1719 )	2022-10-19 12:20:51 +08:00
LuGY	730f88f8e1	[NFC] polish _checkpoint_hook.py code style (#1722 )	2022-10-19 12:20:51 +08:00
CsRic	ea961d8fd1	[NFC] polish colossalai/zero/sharded_param/__init__.py code style (#1717 ) Co-authored-by: ric <mkkt_bkkt@mail.ustc.edu.cn>	2022-10-19 12:20:51 +08:00
yuxuan-lou	2b49ca80a3	[NFC] polish colossalai/nn/lr_scheduler/linear.py code style (#1716 )	2022-10-19 12:20:51 +08:00
shenggan	e1d780030d	[NFC] polish colossalai/nn/metric/accuracy_2p5d.py code style (#1714 )	2022-10-19 12:20:51 +08:00
YuliangLiu0306	d373e67b99	[hotfix] resharding cost issue (#1742 )	2022-10-19 11:33:43 +08:00
Jiarui Fang	24e84eba60	upgrade version to 0.1.11rc1 (#1739 )	2022-10-19 11:26:00 +08:00
Frank Lee	d2e0e39c9d	[release] update to v0.1.11 (#1736 )	2022-10-19 00:28:00 +08:00
HELSON	f69f9bf223	[zero] add chunk init function for users (#1729 ) * add chunk manager init function * fix unit tests * add comment * add flush=True	2022-10-18 16:31:22 +08:00
YuliangLiu0306	51b89d2202	[autoparallel] runtime_backward_apply (#1720 )	2022-10-18 10:44:58 +08:00
Super Daniel	393f594051	[fx/meta/rpc] move _meta_registration.py to fx folder / register fx functions with compatibility checks / remove color debug (#1710 ) * [fx] move meta registration * [fx] fix tests. * [fx] fix test. * [fx] fix. * [meta] refactor meta registration.py. * [fx] add compatibility descriptions. * [fx] polish import. * [fx] add a decorator. * [fx] fix tests. * [fx] remove print. * [fx] edit raise error. * [fx] edit raise error. * [fx] add type hint. * [fx] fix import in experimental. * [rpc] remove color debug. * [meta] fix naming.	2022-10-18 10:44:23 +08:00
YuliangLiu0306	845ff4a47a	[autoparallel] resnet block runtime apply (#1709 ) * [autoparallel] resnet block runtime apply * separate buffer and parameter in MemoryCost * polish code * add comments and todos * fix test issue	2022-10-17 13:37:38 +08:00
Frank Lee	22a115406b	[autoparallel] fixed broken node handler tests (#1708 )	2022-10-14 18:25:59 +08:00
HELSON	1468e4bcfc	[zero] add constant placement policy (#1705 ) * fixes memory leak when parameter is in fp16 in ZeroDDP init. * bans chunk releasement in CUDA. Only when a chunk is about to offload, it is allowed to release. * adds a constant placement policy. With it, users can allocate a reserved caching memory space for parameters.	2022-10-14 17:53:16 +08:00
binmakeswell	5f41463a76	add optimizer README for tutorials (#1707 )	2022-10-14 09:10:18 +00:00
Frank Lee	6c331a5a09	[autoparallel] refactored the autoparallel module for organization (#1706 ) * [autoparallel] refactored the autoparallel module for organization * polish code	2022-10-14 13:27:00 +08:00
Frank Lee	91cd34e6e0	[unittest] added doc for the pytest wrapper (#1704 )	2022-10-14 10:56:17 +08:00
YuliangLiu0306	451cd72dea	[autoparallel] adapt runtime passes (#1703 ) * [autoparallel] adapt runtime passes v2 * polish code	2022-10-14 10:14:07 +08:00
Jiarui Fang	21962e1593	[embedding] rename FreqAwareEmbedding -> CachedEmbedding (#1699 )	2022-10-13 22:22:27 +08:00
Frank Lee	0e52f3d3d5	[unittest] supported conditional testing based on env var (#1701 ) polish code	2022-10-13 19:38:45 +08:00
Frank Lee	8283e95db3	[autoparallel] collated all deprecated files (#1700 ) * [autoparallel] collated all deprecated files * polish code	2022-10-13 18:24:11 +08:00
Frank Lee	e2355d01b9	[autoparallel] init new folder structure (#1696 )	2022-10-13 14:18:55 +08:00
YuliangLiu0306	81f7530ee7	[autoparallel] adapt solver and CostGraph with new handler (#1695 ) * [autoparallel] adapt solver and CostGraph with new handler * fix test issue	2022-10-13 14:04:15 +08:00
YuliangLiu0306	42b882ef06	[autoparallel] add output handler and placeholder handler (#1694 ) * [autoparallel] add output handler and placeholder handler * Delete test_solver_with_resnet.py * fix test bugs	2022-10-13 13:42:36 +08:00
YuliangLiu0306	56088e6d98	[autoparallel] add pooling handler (#1690 ) * [autoparallel] add pooling handler * polish code	2022-10-13 13:42:13 +08:00
YuliangLiu0306	319d654f79	[autoparallel] where_handler_v2 (#1688 ) * where generator * [autoparallel] where_handler_v2	2022-10-13 11:02:22 +08:00
Boyuan Yao	31d2f03d27	[autoparallel] fix C version rotor inconsistency (#1691 )	2022-10-12 15:21:58 +08:00
Jiarui Fang	363fc2861a	[embeddings] more detailed timer (#1692 )	2022-10-12 12:01:21 +08:00
Frank Lee	4973157ad7	[autoparallel] added sharding spec conversion for linear handler (#1687 )	2022-10-12 11:16:18 +08:00
YuliangLiu0306	af718e83f2	[autoparallel] add reshape handler v2 and fix some previous bug (#1683 )	2022-10-11 18:12:59 +08:00
YuliangLiu0306	6878e42248	[hotfix] solver bug caused by dict type comm cost (#1686 )	2022-10-11 17:57:03 +08:00
Super Daniel	3dd6994427	[fx/profiler] assigned UUID to each unrecorded tensor/ improved performance on GPT-2 (#1679 ) * [fx/profiler] modify data_ptr into uuid for all tensors. * [fx] modify uuid. * [fx/profiler] tune performance on GPT-2. * [fx] updates. * [fx] debug. * [fx] debug. * [fx] cuda.	2022-10-11 11:03:35 +08:00
Kirigaya Kazuto	0df5034a36	[pipeline/fix-bug] num_microbatches support any integrate | stable chimera | launch tool for rpc pp framework (#1684 ) * [pipeline/tuning] improve dispatch performance both time and space cost * [pipeline/converge] add interface for testing convergence * [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style * Update PipelineBase.py * [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera * [pipeline/chimera] test chimera | fix bug of initializing * [pipeline/pytree] add pytree to process args and kwargs | provide to process args and kwargs after forward * [pipeline/fix-bug] num_microbatches support any integrate | stable chimera | launch tool for rpc pp framework	2022-10-10 16:01:02 +08:00
jim	e5ab6be72e	[hotfix] fix colotensor.type() raise NotImplementedError (#1682 )	2022-10-10 10:13:31 +08:00
Kirigaya Kazuto	3b2a59b0ba	[pipeline/rank_recorder] fix bug when process data before backward | add a tool for multiple ranks debug (#1681 ) * [pipeline/tuning] improve dispatch performance both time and space cost * [pipeline/converge] add interface for testing convergence * [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style * Update PipelineBase.py * [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera * [pipeline/chimera] test chimera | fix bug of initializing * [pipeline/pytree] add pytree to process args and kwargs | provide to process args and kwargs after forward	2022-10-09 17:32:57 +08:00
YuliangLiu0306	517b63939a	[autoparallel] add unary element wise handler v2 (#1674 )	2022-10-09 17:30:42 +08:00
YuliangLiu0306	f6c6a932b8	[autoparallel] add following node generator (#1673 ) * [autoparallel] add following node generator * polish code * polish code * update name of arguments	2022-10-09 14:49:18 +08:00
YuliangLiu0306	52fda88796	[autoparallel] add layer norm handler v2 (#1671 ) * [autoparallel] add layer norm handler v2 * polish code * polish code	2022-10-09 14:23:22 +08:00
Fazzie-Maqianli	87c5ad352a	update version to 0.1.10 (#1676 )	2022-10-09 10:43:29 +08:00
HELSON	b28991dd0a	[feature] A new ZeRO implementation (#1644 )	2022-10-09 09:18:51 +08:00
Boyuan Yao	b1be5b88bd	[autoparallel] fix insecure subprocess (#1680 ) * [autoparallel] fix insecure subprocess * [fx] fix insecure subprocess	2022-10-06 15:07:03 +08:00
Boyuan Yao	d8420f81a4	[hotfix] fix wrong type name in profiler (#1678 )	2022-10-05 21:59:05 +08:00
Boyuan Yao	132b4306b7	[fx] Add concrete info prop (#1677 ) * [fx] concreteinfoprop * [fx] add concreteinfoprop * [fx] modify docstring of ConcreteInfoProp * [fx] fix device error * [fx] modify parameter calculation * [fx] modify parameters calculation	2022-10-04 16:48:24 +08:00
Boyuan Yao	1df98d5b66	[autoparallel] add rotor C version (#1658 ) * [autoparallel] add rotor c version * [fx] remove metainfoprop in rotor solver * [autoparallel] modify C code format * [autoparallel] remove build.py * [autoparallel] fix C extension build * [autoparallel] add C solver consistency test * [autoparallel] remove some unused imports * [autoparallel] refactor rotor solver code * [autoparallel] replace print with colossalai logger * [autoparallel] ranks fixed	2022-10-03 17:13:30 +08:00
YuliangLiu0306	11ec070e53	[hotfix]unit test (#1670 )	2022-09-29 12:49:28 +08:00
Frank Lee	a60024e77a	[autoparallel] added utils for broadcast operation (#1665 ) * [autoparallel] added utils for broadcast operation * polish code	2022-09-29 11:22:29 +08:00
YuliangLiu0306	3f068d1409	[autoparallel] update CommSpec (#1667 )	2022-09-29 11:20:59 +08:00
Frank Lee	247a9dbca9	[autoparallel] added bias comm spec to matmul strategy (#1664 )	2022-09-29 11:08:05 +08:00
YuliangLiu0306	746f8f979d	[autoparallel] add batch norm handler v2 (#1666 )	2022-09-29 11:02:49 +08:00
Kirigaya Kazuto	9708638ded	[pipeline/pytree] add pytree to process args and kwargs | provide `data_process_func` to process args and kwargs after forward (#1642 ) * [pipeline/tuning] improve dispatch performance both time and space cost * [pipeline/converge] add interface for testing convergence * [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style * Update PipelineBase.py * [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera * [pipeline/chimera] test chimera | fix bug of initializing * [pipeline/pytree] add pytree to process args and kwargs | provide to process args and kwargs after forward	2022-09-29 10:58:58 +08:00
YuliangLiu0306	c27e701cb2	[autoparallel] remove no strategy nodes (#1652 ) * [autoparallel] remove no strategy nodes * fix none object iteration issue	2022-09-29 10:43:25 +08:00
Frank Lee	50f16a2850	[autoparallel] added compute resharding costs for node handler (#1662 )	2022-09-28 19:55:44 +08:00
Frank Lee	9ec401a722	[autoparallel] added new strategy constructor template (#1661 ) * [autoparallel] added new strategy constructor template * polish code	2022-09-28 14:01:36 +08:00
Frank Lee	3a4d6f63a8	[autoparallel] added node handler for bmm (#1655 )	2022-09-28 11:32:16 +08:00
YuliangLiu0306	095854477f	[autoparallel] add conv handler v2 (#1663 )	2022-09-28 11:24:59 +08:00
YuliangLiu0306	1e7816a460	[autoparallel] adapt solver with gpt (#1653 )	2022-09-28 11:17:26 +08:00
Jiarui Fang	c638bec028	[embedding] polish async copy (#1657 )	2022-09-27 14:37:03 +08:00
Jiarui Fang	988570e4a6	[embedding] add more detail profiling (#1656 )	2022-09-27 13:43:59 +08:00
Jiarui Fang	e1f97fd2b8	[embedding] print profiling results (#1654 )	2022-09-27 12:50:33 +08:00
Frank Lee	30e50c8b4a	[autoparallel] implemented all matmul strategy generator (#1650 )	2022-09-27 12:06:25 +08:00
YuliangLiu0306	03978aad45	[autoparallel] change the following nodes strategies generation logic (#1636 ) * [autoparallel] change the following nodes strategies generation logic * fix unit test	2022-09-27 11:20:52 +08:00
YuliangLiu0306	59f100510a	[autoparallel] where handler (#1651 ) * [autoparallel] where handler * fix unit test	2022-09-27 11:20:43 +08:00
Super Daniel	6135e178b3	[fx] refactor code for profiler / enable fake tensor movement. (#1646 ) * [fx/profiling] provide summary for MetaInfoProp. * [fx/profiler] provide a table of summary. * [fx/profiler] provide a table of summary. * [fx/profiler] provide a table of summary. * [fx/profiler] provide a table of summary. * [fx] optimize table repr. * [fx] optimize table repr. * [fx] refactor code for profiler. * [fx] add docstring. * [fx] add docstring. * [fx] skip test. * [fx] redo. * [fx] redo. * [fx] fix import error for torch11. * [fx] fix import error for torch11.	2022-09-27 10:26:52 +08:00
Boyuan Yao	5d0fdb9cb4	[fx] fix offload codegen test (#1648 ) * [fx] fix offload codegen test * [fx] modify typing	2022-09-27 10:25:27 +08:00
Frank Lee	45b39a692a	[autoparallel] implemented linear projection strategy generator (#1639 )	2022-09-26 16:58:14 +08:00
Frank Lee	154d3ef432	[fix] fixed the collective pattern name for consistency (#1649 ) * [fix] fixed the collective pattern name for consistency * polish code	2022-09-26 16:39:37 +08:00
YuliangLiu0306	b2b2a4af98	[autoparallel] adapt solver with mlp (#1638 )	2022-09-26 15:26:14 +08:00
Jiarui Fang	04443605a5	[embedding] non-blocking cpu-gpu copy (#1647 )	2022-09-26 14:57:57 +08:00
CsRic	0767f67a0f	[embedding] isolate cache_op from forward (#1645 ) Co-authored-by: ric <mkkt_bkkt@mail.ustc.edu.cn>	2022-09-26 11:18:59 +08:00
Jiarui Fang	c5d39215f6	Revert "[feature] new zero implementation (#1623 )" (#1643 ) This reverts commit `5be118f405`.	2022-09-26 10:06:03 +08:00
HELSON	5be118f405	[feature] new zero implementation (#1623 )	2022-09-24 19:58:18 +08:00

965 Commits (7c7921f71bf93e739b1939c724a4cfe9cd405247)