ColossalAI

Commit Graph

Author	SHA1	Message	Date
oahzxl	71e72c4890	last version of benchmark	2023-01-05 17:54:25 +08:00
YuliangLiu0306	b5a3a4a65f	[device] find best logical mesh	2023-01-05 17:21:29 +08:00
yuxuan-lou	28e2d16794	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/graph_analysis.py code style (#2340 )	2023-01-05 16:53:24 +08:00
YuliangLiu0306	9c9246c0d9	[device] alpha beta profiler (#2311 ) * [device] alpha beta profiler * add usage * fix variable name	2023-01-05 16:39:55 +08:00
Maruyama_Aya	bd12a49e2a	[NFC] polish <colossalai/auto_parallel/tensor_shard/deprecated/constants.py> code style (#2339 )	2023-01-05 16:20:54 +08:00
Haofan Wang	9edd0aa75e	Update train_dreambooth_colossalai.py accelerator.num_processes -> gpc.get_world_size(ParallelMode.DATA)	2023-01-05 15:49:57 +08:00
Frank Lee	f1bc2418c4	[setup] make cuda extension build optional (#2336 ) * [setup] make cuda extension build optional * polish code * polish code * polish code	2023-01-05 15:13:11 +08:00
Frank Lee	8711310cda	[setup] remove torch dependency (#2333 )	2023-01-05 13:53:28 +08:00
Zihao	35427bcab4	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/unary_elementwise_handler.py code style (#2326 )	2023-01-05 12:18:08 +08:00
oahzxl	55cb713f36	update min memory strategy, reduce mem usage by 30%	2023-01-05 11:29:22 +08:00
Fazzie-Maqianli	89f26331e9	[example] diffusion update diffusion, DreamBooth (#2329 )	2023-01-05 11:23:26 +08:00
Frank Lee	6e34cc0830	[workflow] fixed pypi release workflow error (#2328 )	2023-01-05 10:52:43 +08:00
Frank Lee	2916eed34a	[workflow] fixed pypi release workflow error (#2327 )	2023-01-05 10:48:38 +08:00
Frank Lee	8d8dec09ba	[workflow] added workflow to release to pypi upon version change (#2320 ) * [workflow] added workflow to release to pypi upon version change * polish code * polish code * polish code	2023-01-05 10:40:18 +08:00
Frank Lee	693ef121a1	[workflow] removed unused assign reviewer workflow (#2318 )	2023-01-05 10:40:07 +08:00
binmakeswell	e512ca9c24	[doc] update stable diffusion link (#2322 ) * [doc] update link	2023-01-04 19:38:06 +08:00
Frank Lee	e8dfa2e2e0	[workflow] rebuild cuda kernels when kernel-related files change (#2317 )	2023-01-04 17:23:59 +08:00
Jiarui Fang	db6eea3583	[builder] reconfig op_builder for pypi install (#2314 )	2023-01-04 16:32:32 +08:00
Fazzie-Maqianli	a9b27b9265	[example] fix dreambooth format (#2315 )	2023-01-04 16:20:00 +08:00
Sze-qq	da1c47f060	update ColossalAI logo (#2316 ) Co-authored-by: siqi <siqi@siqis-MacBook-Pro.local>	2023-01-04 15:41:53 +08:00
Junming Wu	4a79c10750	[NFC] polish colossalai/cli/benchmark/__init__.py code style (#2308 )	2023-01-04 15:09:57 +08:00
Ofey Chan	87d2defda6	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/layer_norm_handler.py code style (#2305 )	2023-01-04 15:09:57 +08:00
ver217	116e3d0b8f	[NFC] polish communication/p2p_v2.py code style (#2303 )	2023-01-04 15:09:57 +08:00
xyupeng	b965585d05	[NFC] polish colossalai/amp/torch_amp/torch_amp.py code style (#2290 )	2023-01-04 15:09:57 +08:00
Zangwei Zheng	d1e5bafcd4	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/__init__.py code style (#2291 )	2023-01-04 15:09:57 +08:00
shenggan	950685873f	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/reshape_handler.py code style (#2292 )	2023-01-04 15:09:57 +08:00
Ziheng Qin	3041014089	[NFC] polish colossalai/amp/naive_amp/grad_scaler/dynamic_grad_scaler.py code style (#2299 ) Co-authored-by: henryqin1997 <henryqin1997@gamil.com>	2023-01-04 15:09:57 +08:00
アマデウス	49715a78f0	[NFC] polish colossalai/cli/benchmark/benchmark.py code style (#2287 )	2023-01-04 15:09:57 +08:00
Zirui Zhu	1c29b173c9	[NFC] polish colossalai/auto_parallel/tensor_shard/node_handler/getitem_handler.py code style (#2289 )	2023-01-04 15:09:57 +08:00
Zihao	3a02b46447	[auto-parallel] refactoring ColoTracer (#2118 ) * add meta_data_computing * add checkpoint_annotation * rename proxy.data to proxy.meta_data and add bias addition pass * polish code * delete meta_prop_pass invoke and rename ori_node to orig_node * add TracerType * unify meta data computing * delete TracerType * handle setitem operation * operator.setitem	2023-01-04 14:44:22 +08:00
Jiarui Fang	32253315b4	[example] update diffusion readme with official lightning (#2304 )	2023-01-04 13:13:38 +08:00
HELSON	5d3a2be3af	[amp] add gradient clipping for unit tests (#2283 ) * [amp] add gradient clipping in unit tests * fix bugs	2023-01-04 11:59:56 +08:00
HELSON	e00cedd181	[example] update gemini benchmark bash (#2306 )	2023-01-04 11:59:26 +08:00
Frank Lee	9b765e7a69	[setup] removed the build dependency on colossalai (#2307 )	2023-01-04 11:38:42 +08:00
Boyuan Yao	d45695d94e	Merge pull request #2258 from hpcaitech/debug/ckpt-autoparallel [autockpt] provide option for activation checkpoint search in SPMD solver	2023-01-04 11:37:28 +08:00
binmakeswell	c8144223b8	[doc] update diffusion doc (#2296 )	2023-01-03 21:27:44 +08:00
binmakeswell	2fac699923	[doc] update news (#2295 )	2023-01-03 21:09:11 +08:00
binmakeswell	4b72b2d4d3	[doc] update news	2023-01-03 21:05:54 +08:00
Jiarui Fang	16cc8e6aa7	[builder] MOE builder (#2277 )	2023-01-03 20:29:39 +08:00
Boyuan Yao	b904748210	[autoparallel] bypass MetaInfo when unavailable and modify BCAST_FUNC_OP metainfo (#2293 ) * [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline * [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop * [autoparallel] specify comm nodes' memory cost in construct chain * [autoparallel] fix wrong runtime apply calculation * [autoparallel] fix wrong runtime apply calculation * [autoparallel] fix wrong runtime apply calculation * [autoparallel] bypass metainfo when available and modify BCAST_FUNC_OP	2023-01-03 20:28:01 +08:00
Jiarui Fang	26e171af6c	[version] 0.1.14 -> 0.2.0 (#2286 )	2023-01-03 20:25:13 +08:00
Super Daniel	8ea50d999e	[hotfix] pass a parameter. (#2288 ) * [autockpt] make it work. * [autockpt] linearize / merge shape-consistency nodes. * [autockpt] considering parameter and optimizer weights. * [hotfix] pass a parameter.	2023-01-03 18:05:06 +08:00
ZijianYY	df1d6dc553	[examples] using args and combining two versions for PaLM (#2284 )	2023-01-03 17:49:00 +08:00
zbian	e94c79f15b	improved allgather & reducescatter for 3d	2023-01-03 17:46:08 +08:00
binmakeswell	c719798abe	[doc] add feature diffusion v2, bloom, auto-parallel (#2282 )	2023-01-03 17:35:07 +08:00
HELSON	62c38e3330	[zero] polish low level zero optimizer (#2275 )	2023-01-03 17:22:34 +08:00
Ziyue Jiang	ac863a01d6	[example] add benchmark (#2276 ) * add benchmark * merge common func * add total and avg tflops Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>	2023-01-03 17:20:59 +08:00
Boyuan Yao	22e947f982	[autoparallel] fix runtime apply memory estimation (#2281 ) * [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline * [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop * [autoparallel] specify comm nodes' memory cost in construct chain * [autoparallel] fix wrong runtime apply calculation * [autoparallel] fix wrong runtime apply calculation * [autoparallel] fix wrong runtime apply calculation	2023-01-03 17:18:07 +08:00
BlueRum	1405b4381e	[example] fix save_load bug for dreambooth (#2280 )	2023-01-03 17:13:29 +08:00
Super Daniel	8e8900ff3f	[autockpt] considering parameter and optimizer weights. (#2279 ) * [autockpt] make it work. * [autockpt] linearize / merge shape-consistency nodes. * [autockpt] considering parameter and optimizer weights.	2023-01-03 16:55:49 +08:00

2089 Commits (91ccf97514af50111551e88a8a194c60f82590b4)
