InternLM

Commit Graph

Author	SHA1	Message	Date
Guoteng	7c820cfa40	feat(init): add skip args check flag and add zero overlap flag (#222) * feat(init): add skip args check flag * fix(optim): add param overlap enable flag	2023-08-24 16:44:18 +08:00
ytxiong	eee93b5a68	test(model): support fp32 with flash_attn (#223) * support tf32 with flash * move autocast to attention * fix lint * fix lint * fix lint * fix lint * fix some bugs in model * modify the convert dtype	2023-08-24 13:54:44 +08:00
ytxiong	a017cab4b3	fix(): move sequence_parallel to parallel config (#224) move sequence_parallel to parallel config * set the sequence_parallel default value to False * fix lint * fix lint * fix lint	2023-08-24 09:49:04 +08:00
ytxiong	c219065348	feat(): support sequence_parallel (#180) support sequence_parallel for no pipeline * sequence_parallel does not support no-flash-attn * support sequence parallel for pipeline * add memory profiler * Update 13B.py * add memory profiler * fix evaluation bug * remove some unnecessary code * remove some unnecessary code * Update parallel_context.py * modify the config * remove memory profiler * modify the config * support selective dropout	2023-08-07 16:42:52 +08:00
ytxiong	853becfb6e	feat(): support fp32 training (#155) support float32 training * fix lint * add adaptation in model/utils.py * remove some unnecessary code * fix lint * feat(optim): add support for fp32 zero * Revert "Merge pull request #2 from SolenoidWGT/fp32_zero" This reverts commit `53fc50b0e5`, reversing changes made to `40f24d0a73`. revert commit * merge develop * Update utils.py * support fp32 in zero optimizer * modify the dtype --------- Co-authored-by: wangguoteng.p <wangguoteng925@qq.com>	2023-08-04 16:05:30 +08:00
ytxiong	5ee651c2f1	feat(): support not-flash-attn for pp and no-pp (#145) support not flash attention for no-pp * support pipeline * modify the config * refactor the code * refactor the code * remove some unnecessary code	2023-07-28 16:13:04 +08:00
ytxiong	fd398fae1a	refactor(rotaryEmbedding): refactor forward (#120) * use fp16 in instruction (#80) * delete torch_dtype of README's example code (#100) * refactor the forward for rotary embedding --------- Co-authored-by: WRH <12756472+wangruohui@users.noreply.github.com> Co-authored-by: x54-729 <45304952+x54-729@users.noreply.github.com>	2023-07-25 15:25:48 +08:00
Sun Peng	fa7337b37b	initial commit	2023-07-06 12:55:23 +08:00

8 Commits (b46d1c17aff37d50f4365bbbbccf01505dce3598)