ColossalAI

Commit Graph

Author	SHA1	Message	Date
Jianghai	93aeacca34	[Inference]Update inference config and fix test (#5178 ) * unify the config setting * fix test * fix import * fix test * fix * fix * add logger * revise log info --------- Co-authored-by: CjhHa1 <cjh18671720497outlook.com>	2024-01-11 13:39:29 +00:00
Yuanheng Zhao	3de2e62299	[Inference] Add CacheBlock and KV-Cache Manager (#5156 ) * [Inference] Add KVCache Manager * function refactored * add test for KVCache Manager * add attr beam width * Revise alloc func in CacheManager * Fix docs and pytests * add tp slicing for head number * optimize shapes of tensors used as physical cache * Apply using InferenceConfig on KVCacheManager * rm duplicate config file * Optimize cache allocation: use contiguous cache * Fix config in pytest (and config)	2024-01-11 13:39:29 +00:00
yuehuayingxueluo	fab9b931d9	[Inference]Add BatchInferState, Sequence and InferConfig (#5149 ) * add infer_struct and infer_config * update codes * change InferConfig * Add hf_model_config to the engine * rm _get_hf_model_config * update codes * made adjustments according to the feedback from the reviewer. * update codes * add ci test for config and struct	2024-01-11 13:39:29 +00:00
Yuanheng Zhao	2bb92243d4	[Inference/NFC] Clean outdated inference tests and deprecated kernels (#5159 ) * [inference/nfc] remove outdated inference tests * remove outdated kernel tests * remove deprecated triton kernels * remove imports from deprecated kernels	2024-01-11 13:39:29 +00:00
Jianghai	56e75eeb06	[Inference] Add readme (roadmap) and fulfill request handler (#5147 ) * request handler * add readme --------- Co-authored-by: CjhHa1 <cjh18671720497outlook.com>	2024-01-11 13:39:29 +00:00
Jianghai	4cf4682e70	[Inference] First PR for rebuild colossal-infer (#5143 ) * add engine and scheduler * add dirs --------- Co-authored-by: CjhHa1 <cjh18671720497outlook.com>	2024-01-11 13:39:29 +00:00
binmakeswell	c174c4fc5f	[doc] fix doc typo (#5256 ) * [doc] fix annotation display * [doc] fix llama2 doc	2024-01-11 21:01:11 +08:00
flybird11111	e830ef917d	[ci] fix shardformer tests. (#5255 ) * fix ci fix * revert: revert p2p * feat: add enable_metadata_cache option * revert: enable t5 tests --------- Co-authored-by: Wenhao Chen <cwher@outlook.com>	2024-01-11 19:07:45 +08:00
digger yu	756c400ad2	fix typo in applications/ColossalEval/README.md (#5250 )	2024-01-11 17:58:38 +08:00
Frank Lee	2b83418719	[ci] fixed ddp test (#5254 ) * [ci] fixed ddp test * polish	2024-01-11 17:16:32 +08:00
Frank Lee	d5eeeb1416	[ci] fixed booster test (#5251 ) * [ci] fixed booster test * [ci] fixed booster test * [ci] fixed booster test	2024-01-11 16:04:45 +08:00
Frank Lee	edf94a35c3	[workflow] fixed build CI (#5240 ) * [workflow] fixed build CI * polish * polish * polish * polish * polish	2024-01-10 22:34:16 +08:00
digger yu	41e52c1c6e	[doc] fix typo in Colossal-LLaMA-2/README.md (#5247 )	2024-01-10 19:24:56 +08:00
Elsa Granger	d565df3821	[pipeline] A more general _communicate in p2p (#5062 ) * A more general _communicate * feat: finish tree_flatten version p2p * fix: update p2p api calls --------- Co-authored-by: Wenhao Chen <cwher@outlook.com>	2024-01-08 15:37:27 +08:00
binmakeswell	7bc6969ce6	[doc] SwiftInfer release (#5236 ) * [doc] SwiftInfer release * [doc] SwiftInfer release * [doc] SwiftInfer release * [doc] SwiftInfer release * [doc] SwiftInfer release	2024-01-08 09:55:12 +08:00
github-actions[bot]	4fb4a22a72	[format] applied code formatting on changed files in pull request 5234 (#5235 ) Co-authored-by: github-actions <github-actions@github.com>	2024-01-07 20:55:34 +08:00
binmakeswell	b9b32b15e6	[doc] add Colossal-LLaMA-2-13B (#5234 ) * [doc] add Colossal-LLaMA-2-13B * [doc] add Colossal-LLaMA-2-13B * [doc] add Colossal-LLaMA-2-13B	2024-01-07 20:53:12 +08:00
JIMMY ZHAO	ce651270f1	[doc] Make leaderboard format more uniform and good-looking (#5231 ) * Make leaderboard format more unifeid and good-looking * Update README.md * Update README.md	2024-01-06 17:12:29 +08:00
Camille Zhong	915b4652f3	[doc] Update README.md of Colossal-LLAMA2 (#5233 ) * Update README.md * Update README.md	2024-01-06 17:06:41 +08:00
Tong Li	d992b55968	[Colossal-LLaMA-2] Release Colossal-LLaMA-2-13b-base model (#5224 ) * update readme * update readme * update link * update * update readme * update * update * update * update title * update example * update example * fix content * add conclusion * add license * update * update * update version * fix minor	2024-01-05 17:24:26 +08:00
digger yu	b0b53a171c	[nfc] fix typo colossalai/shardformer/ (#5133 )	2024-01-04 16:21:55 +08:00
flybird11111	451e9142b8	fix flash attn (#5209 )	2024-01-03 14:39:53 +08:00
flybird11111	365671be10	fix-test (#5210 ) fix-test fix-test	2024-01-03 14:26:13 +08:00
Hongxin Liu	7f3400b560	[devops] update torch versoin in ci (#5217 )	2024-01-03 11:46:33 +08:00
Wenhao Chen	d799a3088f	[pipeline]: add p2p fallback order and fix interleaved pp deadlock (#5214 ) * fix: add fallback order option and update 1f1b * fix: fix deadlock comm in interleaved pp * test: modify p2p test	2024-01-03 11:34:49 +08:00
Wenhao Chen	3c0d82b19b	[pipeline]: support arbitrary batch size in forward_only mode (#5201 ) * fix: remove drop last in val & test dataloader * feat: add run_forward_only, support arbitrary bs * chore: modify ci script	2024-01-02 23:41:12 +08:00
flybird11111	02d2328a04	support linear accumulation fusion (#5199 ) support linear accumulation fusion support linear accumulation fusion fix	2023-12-29 18:22:42 +08:00
Zhongkai Zhao	64519eb830	[doc] Update required third-party library list for testing and torch comptibility checking (#5207 ) * doc/update requirements-test.txt * update torch-cuda compatibility check	2023-12-27 18:03:45 +08:00
Yuanchen	eae01b6740	Improve logic for selecting metrics (#5196 ) Co-authored-by: Xu <yuanchen.xu00@gmail.com>	2023-12-22 14:52:50 +08:00
Wenhao Chen	4fa689fca1	[pipeline]: fix p2p comm, add metadata cache and support llama interleaved pp (#5134 ) * test: add more p2p tests * fix: remove send_forward_recv_forward as p2p op list need to use the same group * fix: make send and receive atomic * feat: update P2PComm fn * feat: add metadata cache in 1f1b * feat: add metadata cache in interleaved pp * feat: modify is_xx_stage fn * revert: add _broadcast_object_list * feat: add interleaved pp in llama policy * feat: set NCCL_BUFFSIZE in HybridParallelPlugin	2023-12-22 10:44:00 +08:00
BlueRum	af952673f7	polish readme in application/chat (#5194 )	2023-12-20 11:28:39 +08:00
flybird11111	681d9b12ef	[doc] update pytorch version in documents. (#5177 ) * fix aaa fix fix fix * fix * fix * test ci * fix ci fix * update pytorch version in documents	2023-12-15 18:16:48 +08:00
Yuanchen	3ff60d13b0	Fix ColossalEval (#5186 ) Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>	2023-12-15 15:06:06 +08:00
flybird11111	79718fae04	[shardformer] llama support DistCrossEntropy (#5176 ) * fix aaa fix fix fix * fix * fix * test ci * fix ci fix * llama support dist-cross fix fix fix fix fix fix fix fix * fix * fix * fix fix * test ci * test ci * fix * [Colossal-Llama-2] Add finetuning Colossal-Llama-2 example (#4878) * Add finetuning Colossal-Llama-2 example * Add finetuning Colossal-Llama-2 example 2 * Add finetuning Colossal-Llama-2 example and support NEFTuning * Add inference example and refine neftune * Modify readme file * update the imports --------- Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com> Co-authored-by: Camille Zhong <44392324+Camille7777@users.noreply.github.com> * llama support dist-cross fix fix fix fix fix fix fix fix * fix * fix * fix fix * test ci * test ci * fix * fix ci * fix ci --------- Co-authored-by: Yuanchen <70520919+chengeharrison@users.noreply.github.com> Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com> Co-authored-by: Camille Zhong <44392324+Camille7777@users.noreply.github.com>	2023-12-13 01:39:14 +08:00
Yuanchen	cefdc32615	[ColossalEval] Support GSM, Data Leakage Evaluation and Tensor Parallel (#5169 ) * Support GSM, Data Leakage Evaluation and Tensor Parallel * remove redundant code and update inference.py in examples/gpt_evaluation --------- Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>	2023-12-12 14:47:35 +08:00
Michelle	b07a6f4e27	[colossalqa] fix pangu api (#5170 ) * fix pangu api * add comment	2023-12-11 14:08:11 +08:00
flybird11111	21aa5de00b	[gemini] hotfix NaN loss while using Gemini + tensor_parallel (#5150 ) * fix aaa fix fix fix * fix * fix * test ci * fix ci fix	2023-12-08 11:10:51 +08:00
Yuanchen	b397104438	[Colossal-Llama-2] Add finetuning Colossal-Llama-2 example (#4878 ) * Add finetuning Colossal-Llama-2 example * Add finetuning Colossal-Llama-2 example 2 * Add finetuning Colossal-Llama-2 example and support NEFTuning * Add inference example and refine neftune * Modify readme file * update the imports --------- Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com> Co-authored-by: Camille Zhong <44392324+Camille7777@users.noreply.github.com>	2023-12-07 14:02:03 +08:00
flybird11111	3dbbf83f1c	fix (#5158 ) fix	2023-12-05 14:28:36 +08:00
Michelle	368b5e3d64	[doc] fix colossalqa document (#5146 ) * fix doc * modify doc	2023-12-01 21:39:53 +08:00
Michelle	c7fd9a5213	[ColossalQA] refactor server and webui & add new feature (#5138 ) * refactor server and webui & add new feature * add requirements * modify readme and ui	2023-11-30 22:55:52 +08:00
flybird11111	2a2ec49aa7	[plugin]fix 3d checkpoint load when booster boost without optimizer. (#5135 ) * fix 3d checkpoint load when booster boost without optimizer fix 3d checkpoint load when booster boost without optimizer * test ci * revert ci * fix fix	2023-11-30 18:37:47 +08:00
github-actions[bot]	f6731db67c	[format] applied code formatting on changed files in pull request 5115 (#5118 ) Co-authored-by: github-actions <github-actions@github.com>	2023-11-29 13:39:14 +08:00
github-actions[bot]	9b36640f28	[format] applied code formatting on changed files in pull request 5124 (#5125 ) Co-authored-by: github-actions <github-actions@github.com>	2023-11-29 13:39:02 +08:00
github-actions[bot]	d10ee42f68	[format] applied code formatting on changed files in pull request 5088 (#5127 ) Co-authored-by: github-actions <github-actions@github.com>	2023-11-29 13:38:37 +08:00
digger yu	9110406a47	fix typo change JOSNL TO JSONL etc. (#5116 )	2023-11-29 11:08:32 +08:00
Frank Lee	2899cfdabf	[doc] updated paper citation (#5131 )	2023-11-29 10:47:51 +08:00
binmakeswell	177c79f2d1	[doc] add moe news (#5128 ) * [doc] add moe news * [doc] add moe news * [doc] add moe news	2023-11-28 17:44:06 +08:00
Wenhao Chen	7172459e74	[shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088 ) * [shardformer] implement policy for all GPT-J models and test * [shardformer] support interleaved pipeline parallel for bert finetune * [shardformer] shardformer support falcon (#4883) * [shardformer]: fix interleaved pipeline for bert model (#5048) * [hotfix]: disable seq parallel for gptj and falcon, and polish code (#5093) * Add Mistral support for Shardformer (#5103) * [shardformer] add tests to mistral (#5105) --------- Co-authored-by: Pengtai Xu <henryxu880@gmail.com> Co-authored-by: ppt0011 <143150326+ppt0011@users.noreply.github.com> Co-authored-by: flybird11111 <1829166702@qq.com> Co-authored-by: eric8607242 <e0928021388@gmail.com>	2023-11-28 16:54:42 +08:00
アマデウス	126cf180bc	[hotfix] fixed memory usage of shardformer module replacement (#5122 )	2023-11-28 15:38:26 +08:00

2949 Commits (93aeacca342ab03732362dbb9096ab1265f4a8b3)