YH
7b13f7db18
[zero] trivial zero optimizer refactoring ( #2869 )
...
* Fix minor grad store interface
* Apply lint
2 years ago
fastalgo
dbc01b9c04
Update README.md
2 years ago
Frank Lee
e33c043dec
[workflow] moved pre-commit to post-commit ( #2895 )
2 years ago
Jiatong (Julius) Han
8c8a39be95
[hotfix]: Remove math.prod dependency ( #2837 )
...
* Remove math.prod dependency
* Fix style
* Fix style
---------
Co-authored-by: Jiatong Han <jiatong.han@u.nus.edu>
2 years ago
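`math.prod` only exists on Python >= 3.8, so removing it presumably restores support for older interpreters. A minimal stand-in under that assumption (the actual replacement in #2837 may differ):

```python
# Drop-in replacement for math.prod (Python >= 3.8 only);
# a sketch of the kind of fallback #2837 implies, not the actual patch.
import operator
from functools import reduce

def prod(iterable, start=1):
    """Multiply all items together, like math.prod."""
    return reduce(operator.mul, iterable, start)

assert prod([2, 3, 4]) == 24
```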
YuliangLiu0306
819e25d8b1
[hotfix] fix autoparallel compatibility test issues ( #2754 )
2 years ago
YuliangLiu0306
0f392d7403
[autoparallel] find repeat blocks ( #2854 )
...
* [autoparallel] find repeat blocks
* polish
* polish
* polish
2 years ago
BlueRum
2e16f842a9
[chatgpt] support OPT & GPT for RM training ( #2876 )
2 years ago
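For orientation: a reward model (RM) for RLHF is typically a causal-LM backbone with a scalar value head. An illustrative sketch assuming Hugging Face `transformers`; the class and attribute names are hypothetical, not the repo's actual API:

```python
# Hypothetical reward-model skeleton for OPT/GPT backbones; illustrative only.
import torch
import torch.nn as nn
from transformers import AutoModel

class RewardModel(nn.Module):
    def __init__(self, pretrained: str = "facebook/opt-125m"):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(pretrained)
        self.value_head = nn.Linear(self.backbone.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask=None):
        hidden = self.backbone(input_ids, attention_mask=attention_mask).last_hidden_state
        # One scalar reward per sequence, read off the final token's state.
        return self.value_head(hidden[:, -1]).squeeze(-1)
```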
junxu
c52edcf0eb
Rename class method of ZeroDDP ( #2692 )
2 years ago
HELSON
6e4ac08172
[hotfix] fix chunk size cannot be divided ( #2867 )
...
* [hotfix] fix chunk size cannot be divided
* [hotfix] use numpy for Python 3.8
2 years ago
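The body hints at why numpy appears: if the divisibility fix is lcm-based, `math.lcm` only landed in Python 3.9, while `numpy.lcm` also works on 3.8. A sketch under that assumption, with illustrative names:

```python
# Assumed shape of the fix: round a chunk size up so every alignment divides it.
# math.lcm needs Python >= 3.9; numpy.lcm.reduce is the 3.8-compatible route.
from typing import List
import numpy as np

def aligned_chunk_size(target: int, alignments: List[int]) -> int:
    lcm = int(np.lcm.reduce(alignments))
    return ((target + lcm - 1) // lcm) * lcm  # smallest multiple of lcm >= target

print(aligned_chunk_size(1000, [8, 12]))  # 1008, divisible by both 8 and 12
```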
Alex_996
a4fc125c34
Fix typos ( #2863 )
...
Fix typos, `6.7 -> 6.7b`
2 years ago
dawei-wang
55424a16a5
[doc] fix GPT tutorial ( #2860 )
...
Fix hpcaitech/ColossalAI#2851
2 years ago
Boyuan Yao
eae77c831d
[autoparallel] Patch meta information for nodes that will not be handled by SPMD solver ( #2823 )
...
* [autoparallel] non spmd meta information generator
* [autoparallel] patch meta information for non spmd nodes
2 years ago
Boyuan Yao
c7764d3f22
[autoparallel] Patch meta information of `torch.where` ( #2822 )
...
* [autoparallel] patch meta information of torch.where
* [autoparallel] pre-commit modified
2 years ago
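"Meta information" in these commits is per-op bookkeeping (output shapes, memory, compute cost) consumed by the solver. PyTorch's meta device illustrates the shape part without allocating real memory; a toy example, not the generator's actual mechanism:

```python
import torch

# Shape propagation on the meta device: no data, no allocation, just shapes.
cond = torch.empty(4, 8, dtype=torch.bool, device="meta")
a = torch.empty(4, 8, device="meta")
b = torch.empty(4, 8, device="meta")
print(torch.where(cond, a, b).shape)  # torch.Size([4, 8])
```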
Boyuan Yao
fcc4097efa
[autoparallel] Patch meta information of `torch.tanh()` and `torch.nn.Dropout` ( #2773 )
...
* [autoparallel] tanh meta information
* [autoparallel] remove redundant code
* [autoparallel] patch meta information of torch.nn.Dropout
2 years ago
BlueRum
34ca324b0d
[chatgpt] Support saving ckpt in examples ( #2846 )
...
* [chatgpt] fix train_rm bug with lora
* [chatgpt] support colossalai strategy to train rm
* fix pre-commit
* fix pre-commit 2
* [chatgpt] fix rm eval typo
* fix rm eval
* fix pre-commit
* add support of saving ckpt in examples
* fix single-gpu save
2 years ago
Zheng Zeng
597914317b
[doc] fix typo in opt inference tutorial ( #2849 )
2 years ago
Frank Lee
935346430f
[cli] handled version check exceptions ( #2848 )
...
* [cli] handled version check exceptions
* polish code
2 years ago
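Hardening of this sort usually means wrapping the probe so a failed lookup degrades gracefully instead of crashing. A generic sketch with illustrative names, not the CLI's actual code:

```python
# Generic version-probe hardening; names are illustrative, not colossalai's.
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

def get_installed_version(package: str) -> Optional[str]:
    try:
        return version(package)
    except PackageNotFoundError:
        return None  # caller prints a friendly message instead of a traceback
```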
BlueRum
3eebc4dff7
[chatgpt] fix rm eval ( #2829 )
...
* [chatgpt] fix train_rm bug with lora
* [chatgpt] support colossalai strategy to train rm
* fix pre-commit
* fix pre-commit 2
* [chatgpt] fix rm eval typo
* fix rm eval
* fix pre-commit
2 years ago
Frank Lee
918bc94b6b
[triton] added copyright information for flash attention ( #2835 )
...
* [triton] added copyright information for flash attention
* polish code
2 years ago
Boyuan Yao
7ea6bc7f69
[autoparallel] Patch tensor related operations meta information ( #2789 )
...
* [autoparallel] tensor related meta information prototype
* [autoparallel] tensor related meta information
* [autoparallel] tensor related meta information
* [autoparallel] tensor related meta information
* [autoparallel] tensor related meta information
2 years ago
github-actions[bot]
a5721229d9
Automated submodule synchronization ( #2740 )
...
Co-authored-by: github-actions <github-actions@github.com>
2 years ago
Haofan Wang
47ecb22387
[example] add LoRA support ( #2821 )
...
* add lora
* format
2 years ago
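LoRA itself fits in a few lines: freeze the pretrained weight and learn a low-rank additive update. A minimal illustration (the example added in #2821 most likely uses a library implementation instead):

```python
# Minimal LoRA adapter: frozen linear layer plus low-rank update B @ A.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pretrained weight
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))  # starts at zero
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen path plus scaled low-rank path: W x + scale * B A x
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scale

out = LoRALinear(nn.Linear(128, 64))(torch.randn(2, 128))
```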
ver217
b6a108cb91
[chatgpt] add test checkpoint ( #2797 )
...
* [chatgpt] add test checkpoint
* [chatgpt] test checkpoint use smaller model
2 years ago
Michelle
c008d4ad0c
[NFC] polish colossalai/engine/schedule/_pipeline_schedule.py code style ( #2744 )
2 years ago
mickogoin
58abde2857
Update README.md ( #2791 )
...
Fixed typo on line 285 from "defualt" to "default"
2 years ago
Marco Rodrigues
89f0017a9c
Typo ( #2826 )
2 years ago
Jiarui Fang
bf0204604f
[example] add bert and albert ( #2824 )
2 years ago
YuliangLiu0306
cf6409dd40
Hotfix/auto parallel zh doc ( #2820 )
...
* [hotfix] fix autoparallel zh docs
* polish
* polish
2 years ago
YuliangLiu0306
2059fdd6b0
[hotfix] add copyright for solver and device mesh ( #2803 )
...
* [hotfix] add copyright for solver and device mesh
* add readme
* add alpa license
* polish
2 years ago
LuGY
dbd0fd1522
[CI/CD] fix nightly release CD running on forked repo ( #2812 )
...
* [CI/CD] fix nightly release CD running on forked repo
* fix misunderstanding of dispatch
* remove some build condition, enable notify even when release failed
2 years ago
Boyuan Yao
8593ae1a3f
[autoparallel] rotor solver refactor ( #2813 )
...
* [autoparallel] rotor solver refactor
* [autoparallel] rotor solver refactor
2 years ago
binmakeswell
09f457479d
[doc] update OPT serving ( #2804 )
...
* [doc] update OPT serving
* [doc] update OPT serving
2 years ago
HELSON
56ddc9ca7a
[hotfix] add correct device for fake_param ( #2796 )
2 years ago
ver217
a619a190df
[chatgpt] update readme about checkpoint ( #2792 )
...
* [chatgpt] add save/load checkpoint sample code
* [chatgpt] add save/load checkpoint readme
* [chatgpt] refactor save/load checkpoint readme
2 years ago
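The pattern being documented is the usual state-dict round trip; a generic sketch (the README's actual helpers live in the repo and may differ):

```python
# Generic save/load round trip of the kind the README documents.
import torch

def save_checkpoint(model, optimizer, path):
    torch.save({"model": model.state_dict(), "optim": optimizer.state_dict()}, path)

def load_checkpoint(model, optimizer, path):
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optim"])
```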
ver217
4ee311c026
[chatgpt] strategy add prepare method ( #2766 )
...
* [chatgpt] strategy add prepare method
* [chatgpt] refactor examples
* [chatgpt] refactor strategy.prepare
* [chatgpt] support save/load checkpoint
* [chatgpt] fix unwrap actor
* [chatgpt] fix unwrap actor
2 years ago
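The refactor implied here centralizes setup: examples hand models and optimizers to the strategy once instead of wrapping them ad hoc. A hypothetical sketch of that call pattern; the real `prepare` signature in #2766 may differ:

```python
# Hypothetical Strategy.prepare(): wrap models and (model, optimizer) pairs in one place.
class Strategy:
    def setup_model(self, model):
        return model  # e.g. wrap with DDP/ZeRO here

    def setup_optimizer(self, optimizer, model):
        return optimizer

    def prepare(self, *units):
        out = []
        for unit in units:
            if isinstance(unit, tuple):   # (model, optimizer) pair
                model, optim = unit
                model = self.setup_model(model)
                out.append((model, self.setup_optimizer(optim, model)))
            else:                         # bare model, e.g. a frozen reference model
                out.append(self.setup_model(unit))
        return out if len(out) > 1 else out[0]
```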
Boyuan Yao
a2b43e393d
[autoparallel] Patch meta information of `torch.nn.Embedding` ( #2760 )
...
* [autoparallel] embedding metainfo
* [autoparallel] fix function name in test_activation_metainfo
* [autoparallel] undo changes in activation metainfo and related tests
2 years ago
Boyuan Yao
8e3f66a0d1
[zero] fix wrong import ( #2777 )
2 years ago
Fazzie-Maqianli
ba84cd80b2
fix pip install colossal ( #2764 )
2 years ago
Nikita Shulga
01066152f1
Don't use `torch._six` ( #2775 )
...
* Don't use `torch._six`
This is a private API which is gone after https://github.com/pytorch/pytorch/pull/94709
* Update common.py
2 years ago
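For anyone hitting the same break: `torch._six` shipped small Python 2/3 compatibility shims, and the standard library covers them after the removal (pytorch/pytorch#94709). Typical substitutions:

```python
# Standard-library stand-ins for the removed torch._six shims.
import math

inf = math.inf            # was: from torch._six import inf
string_classes = (str,)   # was: from torch._six import string_classes

assert isinstance("hello", string_classes) and -inf < 0
```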
ver217
a88bc828d5
[chatgpt] disable shard init for colossalai ( #2767 )
2 years ago
binmakeswell
d6d6dec190
[doc] update example and OPT serving link ( #2769 )
...
* [doc] update OPT serving link
* [doc] update example and OPT serving link
* [doc] update example and OPT serving link
2 years ago
Frank Lee
e376954305
[doc] add opt service doc ( #2747 )
2 years ago
BlueRum
613efebc5c
[chatgpt] support colossalai strategy to train rm ( #2742 )
...
* [chatgpt] fix train_rm bug with lora
* [chatgpt] support colossalai strategy to train rm
* fix pre-commit
* fix pre-commit 2
2 years ago
BlueRum
648183a960
[chatgpt] fix train_rm bug with lora ( #2741 )
2 years ago
fastalgo
b6e3b955c3
Update README.md
2 years ago
binmakeswell
30aee9c45d
[NFC] polish code format
2 years ago
YuliangLiu0306
1dc003c169
[autoparallel] distinguish different parallel strategies ( #2699 )
2 years ago
YH
ae86a29e23
Refactor method of grad store ( #2687 )
2 years ago
cloudhuang
43dffdaba5
[doc] fixed a typo in GPT readme ( #2736 )
2 years ago
binmakeswell
93b788b95a
Merge branch 'main' into fix/format
2 years ago