Commit Graph

10 Commits (89a9a600bc4802c912b0ed48d48f70bbcdd8142b)

Author | SHA1 | Message | Date
Guangyao Zhang f20b066c59
[fp8] Disable intranode all_gather; disable redundant all_gather fp8 (#6059)
* use FP8 all_gather only for internode communication; fix pytest

* fix compile and pytest errors on CUDA arch < 8.9

* fix pytest failure

* disable all_gather_into_tensor_flat_fp8

* fix fp8 format

* fix pytest

* resolve review conversations

* convert chunk tuple to list
2024-09-14 10:40:01 +08:00
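
The policy behind this commit, as a minimal sketch: the FP8 cast/de-cast round trip only pays off on the slower internode links, so intranode all_gather stays in full precision. The function name, the LOCAL_WORLD_SIZE heuristic for detecting an intranode group, and the per-tensor scaling are assumptions of this sketch, not the PR's exact code.

```python
import os

import torch
import torch.distributed as dist


def all_gather_maybe_fp8(output_list, tensor, group=None):
    """All-gather that compresses to FP8 only for internode traffic."""
    local_world = int(os.environ.get("LOCAL_WORLD_SIZE", "1"))
    group_size = dist.get_world_size(group)

    if group_size <= local_world:
        # Intranode (NVLink-class links): plain all-gather, no FP8 round trip.
        dist.all_gather(output_list, tensor, group=group)
        return

    # Internode: cast to FP8 with a per-tensor scale, ship as uint8 bytes
    # (NCCL support for FP8 dtypes varies across versions), then dequantize.
    scale = tensor.abs().max().clamp(min=1e-12) / 448.0  # e4m3 max ~448
    fp8 = (tensor / scale).to(torch.float8_e4m3fn).view(torch.uint8)

    byte_out = [torch.empty_like(fp8) for _ in range(group_size)]
    scale_out = [torch.empty_like(scale.reshape(1)) for _ in range(group_size)]
    dist.all_gather(scale_out, scale.reshape(1), group=group)
    dist.all_gather(byte_out, fp8, group=group)

    for dst, raw, s in zip(output_list, byte_out, scale_out):
        dst.copy_(raw.view(torch.float8_e4m3fn).to(dst.dtype) * s)
```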
flybird11111 20722a8c93
[fp8] update reduce-scatter test (#6002)
* fix

* fix

* fix

* fix
2024-08-15 14:40:54 +08:00
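
The shape such a reduce-scatter test typically takes, as a hedged sketch: run the FP8 path and the full-precision collective on identical inputs and compare with loose tolerances, since e4m3 carries only 2-3 significant digits. `reduce_scatter_fp8` is a hypothetical stand-in for the function under test; the check assumes one GPU per rank and an initialized process group.

```python
import torch
import torch.distributed as dist


def check_reduce_scatter_fp8(reduce_scatter_fp8, shape=(1024,), dtype=torch.bfloat16):
    """Compare an FP8 reduce-scatter against the full-precision reference."""
    world_size = dist.get_world_size()
    full = torch.randn(world_size * shape[0], dtype=dtype, device="cuda")
    chunks = list(full.chunk(world_size))

    ref = torch.empty(shape, dtype=dtype, device="cuda")
    dist.reduce_scatter(ref, [c.clone() for c in chunks])

    out = torch.empty(shape, dtype=dtype, device="cuda")
    reduce_scatter_fp8(out, chunks)

    # Loose tolerance: FP8 quantization error dominates, not the reduction.
    assert torch.allclose(out, ref, rtol=0.1, atol=0.1)
```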
flybird11111 597b206001
[fp8] support asynchronous FP8 communication (#5997)
* fix

* fix

* fix

* support async all2all

* support async op for all-gather

* fix

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-08-14 14:08:19 +08:00
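
What "asynchronous FP8 communication" can look like in practice, sketched below: launch the payload collective with async_op=True and hand back a wait handle, deferring dequantization so communication overlaps with compute. Names and the per-tensor scaling scheme are assumptions of this sketch.

```python
import torch
import torch.distributed as dist


def all_gather_fp8_async(output: torch.Tensor, tensor: torch.Tensor, group=None):
    """Async FP8 all-gather; returns a wait() callable to finalize `output`."""
    world_size = dist.get_world_size(group)
    scale = tensor.abs().max().clamp(min=1e-12) / 448.0  # e4m3 max ~448
    fp8 = (tensor / scale).to(torch.float8_e4m3fn).view(torch.uint8)

    # Scales are tiny; gather them synchronously. The payload goes async.
    scales = torch.empty(world_size, dtype=scale.dtype, device=tensor.device)
    dist.all_gather_into_tensor(scales, scale.reshape(1), group=group)

    buf = torch.empty(world_size * fp8.numel(), dtype=torch.uint8, device=tensor.device)
    work = dist.all_gather_into_tensor(buf, fp8, group=group, async_op=True)

    def wait():
        work.wait()  # block until the bytes have arrived
        shards = buf.view(torch.float8_e4m3fn).to(output.dtype).chunk(world_size)
        for dst, shard, s in zip(output.chunk(world_size), shards, scales):
            dst.copy_(shard * s)

    return wait  # caller invokes wait() right before it needs `output`
```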
Hongxin Liu 8241c0c054
[fp8] support gemini plugin (#5978)
* [fp8] refactor hook

* [fp8] support gemini plugin

* [example] add fp8 option for llama benchmark
2024-08-09 14:09:48 +08:00
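
A hedged usage sketch of enabling FP8 communication through the booster plugin API. The `fp8_communication` keyword is inferred from this PR series (cf. the "add fp8_comm flag" bullet in #5928) and should be verified against the GeminiPlugin signature of the installed ColossalAI release.

```python
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin
from colossalai.nn.optimizer import HybridAdam

colossalai.launch_from_torch()

model = torch.nn.Linear(1024, 1024).cuda()
optimizer = HybridAdam(model.parameters(), lr=1e-3)

# `fp8_communication` is an assumed keyword name, not confirmed API.
plugin = GeminiPlugin(fp8_communication=True)
booster = Booster(plugin=plugin)
model, optimizer, *_ = booster.boost(model, optimizer)
```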
Hanks b480eec738
[Feature]: support FP8 communication in DDP, FSDP, Gemini (#5928)
* support fp8_communication in Torch DDP gradient communication, FSDP gradient communication, and FSDP parameter communication

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* implement communication hook for FSDP parameter all-gather

* add unit tests for fp8 operators

* support fp8 communication in GeminiPlugin

* update training scripts to support fsdp and fp8 communication

* fix minor bugs observed in unit tests

* add all_gather_into_tensor_flat_fp8

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* skip the test if torch < 2.2.0

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* skip the test if torch < 2.2.0

* skip the test if torch < 2.2.0

* add fp8_comm flag

* rebase latest fp8 operators

* rebase latest fp8 operators

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-08-08 15:55:01 +08:00
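
For the DDP gradient path, a communication hook is the natural integration point; below is a minimal sketch of an FP8 bucket hook using PyTorch's public register_comm_hook API. Since NCCL cannot sum FP8 directly, the sketch all-gathers the FP8 buckets and averages locally in full precision; this is the simplest correct form, not the PR's implementation, which a production version would replace with a reduce-scatter.

```python
import torch
import torch.distributed as dist


def fp8_allreduce_hook(state, bucket: dist.GradBucket) -> torch.futures.Future:
    """Ship a DDP gradient bucket as FP8 bytes plus a per-rank scale."""
    group = state if state is not None else dist.group.WORLD
    world_size = dist.get_world_size(group)

    grad = bucket.buffer()
    scale = grad.abs().max().clamp(min=1e-12) / 448.0  # keep e4m3 in range
    fp8 = (grad / scale).to(torch.float8_e4m3fn).view(torch.uint8)

    scales = [torch.empty_like(scale.reshape(1)) for _ in range(world_size)]
    dist.all_gather(scales, scale.reshape(1), group=group)

    gathered = [torch.empty_like(fp8) for _ in range(world_size)]
    fut = dist.all_gather(gathered, fp8, group=group, async_op=True).get_future()

    def decompress(_):
        acc = torch.zeros_like(grad)
        for raw, s in zip(gathered, scales):
            acc += raw.view(torch.float8_e4m3fn).to(grad.dtype) * s
        return acc / world_size  # DDP expects the averaged bucket back

    return fut.then(decompress)


# Registration on an existing DDP model:
# ddp_model.register_comm_hook(state=None, hook=fp8_allreduce_hook)
```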
Hongxin Liu ccabcf6485
[fp8] support fp8 amp for hybrid parallel plugin (#5975)
* [fp8] support fp8 amp for hybrid parallel plugin

* [test] add fp8 hook test

* [fp8] fix fp8 linear compatibility
2024-08-07 18:21:08 +08:00
Hongxin Liu 76ea16466f
[fp8] add fp8 linear (#5967)
* [fp8] add fp8 linear

* [test] fix fp8 linear test condition

* [test] fix fp8 linear test condition

* [test] fix fp8 linear test condition
2024-08-07 15:41:49 +08:00
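
A sketch of what an FP8 linear layer does numerically. Real FP8 matmul kernels (e.g. the private torch._scaled_mm) require compute capability 8.9+, which is why the test-condition fixes above gate on hardware; this version emulates the quantize/dequantize round trip in the original dtype so it runs anywhere and shows the scaling scheme such a layer uses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def _quant_fp8(t: torch.Tensor):
    """Per-tensor symmetric quantization to e4m3; returns (fp8, scale)."""
    scale = t.abs().max().clamp(min=1e-12) / 448.0  # e4m3 max normal ~448
    return (t / scale).to(torch.float8_e4m3fn), scale


class FP8Linear(nn.Linear):
    """Linear layer that emulates FP8 matmul numerics (a sketch)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_fp8, x_scale = _quant_fp8(x)
        w_fp8, w_scale = _quant_fp8(self.weight)
        # Dequantize and matmul in the original dtype; a real FP8 kernel
        # would multiply in FP8 and fold both scales into the accumulator.
        out = F.linear(x_fp8.to(x.dtype) * x_scale, w_fp8.to(x.dtype) * w_scale)
        return out + self.bias if self.bias is not None else out
```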
flybird11111 afb26de873
[fp8] support all2all fp8 (#5953)
* support all2all fp8

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* fix

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-08-06 16:58:23 +08:00
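
The all-to-all variant follows the same quantize-ship-dequantize pattern, sketched here with one scale per outgoing chunk; the helper name and scaling scheme are assumptions of this sketch.

```python
import torch
import torch.distributed as dist


def all_to_all_fp8(output_list, input_list, group=None):
    """All-to-all that exchanges FP8 bytes plus a scale per chunk."""
    world_size = dist.get_world_size(group)
    device = input_list[0].device

    # One scale per outgoing chunk, so a single outlier chunk cannot
    # crush the precision of the others.
    scales = torch.stack([t.abs().max().clamp(min=1e-12) / 448.0 for t in input_list])
    fp8_in = [(t / s).to(torch.float8_e4m3fn).view(torch.uint8)
              for t, s in zip(input_list, scales)]

    # Each rank needs the scale of every chunk addressed to it.
    recv_scales = torch.empty(world_size, dtype=scales.dtype, device=device)
    dist.all_to_all_single(recv_scales, scales, group=group)

    recv = [torch.empty(t.numel(), dtype=torch.uint8, device=device) for t in output_list]
    dist.all_to_all(recv, fp8_in, group=group)

    for dst, raw, s in zip(output_list, recv, recv_scales):
        dst.copy_(raw.view(torch.float8_e4m3fn).to(dst.dtype).view_as(dst) * s)
```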
Guangyao Zhang 53cb9606bd
[Feature] llama shardformer fp8 support (#5938)
* add llama shardformer fp8

* llama shardformer parity

* fix typo

* fix all-reduce

* fix pytest failure

* fix reduce op and move function to fp8.py

* fix typo
2024-08-05 10:05:47 +08:00
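
The "fix all-reduce" and "fix reduce op" bullets point at the awkward part of FP8 collectives: NCCL cannot reduce in FP8, so an FP8 all-reduce is typically decomposed into an FP8 all-to-all (the reduce-scatter phase), a local full-precision sum, and an FP8 all-gather. A hedged sketch of that decomposition, assuming numel divisible by the group size:

```python
import torch
import torch.distributed as dist


def all_reduce_fp8(tensor: torch.Tensor, group=None):
    """In-place FP8 all-reduce via all-to-all + local sum + all-gather."""
    world_size = dist.get_world_size(group)
    chunks = list(tensor.chunk(world_size))

    # Phase 1: exchange chunks so rank i holds every rank's i-th chunk.
    scale = tensor.abs().max().clamp(min=1e-12) / 448.0
    fp8 = [(c / scale).to(torch.float8_e4m3fn).view(torch.uint8) for c in chunks]
    recv = [torch.empty_like(b) for b in fp8]
    scales = torch.empty(world_size, dtype=scale.dtype, device=tensor.device)
    dist.all_gather_into_tensor(scales, scale.reshape(1), group=group)
    dist.all_to_all(recv, fp8, group=group)

    # Phase 2: reduce this rank's shard locally in full precision.
    shard = torch.zeros_like(chunks[0])
    for raw, s in zip(recv, scales):
        shard += raw.view(torch.float8_e4m3fn).to(shard.dtype) * s

    # Phase 3: all-gather the reduced shards back into the full tensor.
    s2 = shard.abs().max().clamp(min=1e-12) / 448.0
    fp8_shard = (shard / s2).to(torch.float8_e4m3fn).view(torch.uint8)
    out = [torch.empty_like(fp8_shard) for _ in range(world_size)]
    s2_all = torch.empty(world_size, dtype=s2.dtype, device=tensor.device)
    dist.all_gather_into_tensor(s2_all, s2.reshape(1), group=group)
    dist.all_gather(out, fp8_shard, group=group)

    for dst, raw, s in zip(tensor.chunk(world_size), out, s2_all):
        dst.copy_(raw.view(torch.float8_e4m3fn).to(dst.dtype) * s)
```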
Hongxin Liu 5fd0592767
[fp8] support all-gather flat tensor (#5932) 2024-07-24 16:55:20 +08:00
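
Finally, a sketch matching this commit's shape (cf. the all_gather_into_tensor_flat_fp8 helper referenced in #5928 and disabled in #6059): each rank holds one flat shard of the parameters, casts it to FP8, and gathers all shards into a single flat output with one collective. Names and the scaling scheme are assumptions of this sketch.

```python
import torch
import torch.distributed as dist


def all_gather_flat_fp8(out_flat: torch.Tensor, shard: torch.Tensor, group=None):
    """Gather flat FP8 shards from all ranks into one flat tensor."""
    world_size = dist.get_world_size(group)
    assert out_flat.numel() == world_size * shard.numel()

    scale = shard.abs().max().clamp(min=1e-12) / 448.0  # e4m3 max ~448
    fp8 = (shard / scale).to(torch.float8_e4m3fn).view(torch.uint8)

    scales = torch.empty(world_size, dtype=scale.dtype, device=shard.device)
    dist.all_gather_into_tensor(scales, scale.reshape(1), group=group)

    buf = torch.empty(out_flat.numel(), dtype=torch.uint8, device=shard.device)
    dist.all_gather_into_tensor(buf, fp8, group=group)

    shards = buf.view(torch.float8_e4m3fn).to(out_flat.dtype).chunk(world_size)
    for dst, src, s in zip(out_flat.chunk(world_size), shards, scales):
        dst.copy_(src * s)
```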