3582 Commits (f5c84af0b01bcd2e993d38dc628793f7f0a8ba64)
Author SHA1 Message Date
haze188 5ed5e8cfba solve hang when parallel mode = pp + dp 4 months ago
haze188 fe24789eb1 [misc] solve booster hang by rename the variable 4 months ago
botbw 13b48ac0aa [zero] solve hang 4 months ago
botbw b5bfeb2efd [moe] implement transit between non moe tp and ep 4 months ago
botbw 37443cc7e4 [test] pass mixtral shardformer test 4 months ago
hxwang 46c069b0db [zero] solve hang 4 months ago
hxwang 0fad23c691 [chore] handle non member group 4 months ago
hxwang a249e71946 [test] mixtra pp shard test 4 months ago
hxwang 8ae8525bdf [moe] fix plugin 4 months ago
hxwang 0b76b57cd6 [test] add mixtral transformer test 4 months ago
hxwang f9b6fcf81f [test] add mixtral for sequence classification 4 months ago
Tong Li 1aeb5e8847 [hotfix] Remove unused plan section (#5957) 4 months ago
YeAnbang 66fbf2ecb7 Update README.md (#5958) 4 months ago
YeAnbang 30f4e31a33 [Chat] Fix lora (#5946) 4 months ago
Hongxin Liu 09c5f72595 [release] update version (#5952) 4 months ago
Hongxin Liu 060892162a [zero] hotfix update master params (#5951) 4 months ago
Runyu Lu bcf0181ecd [Feat] Distrifusion Acceleration Support for Diffusion Inference (#5895) 4 months ago
Hongxin Liu 7b38964e3a [shardformer] hotfix attn mask (#5947) 4 months ago
Hongxin Liu 9664b1bc19 [shardformer] hotfix attn mask (#5945) 4 months ago
YeAnbang c8332b9cb5 Merge pull request #5922 from hpcaitech/kto 4 months ago
YeAnbang 6fd9e86864 fix style 4 months ago
YeAnbang de1bf08ed0 fix style 4 months ago
YeAnbang 8a3ff4f315 fix style 4 months ago
zhurunhua ad35a987d3 [Feature] Add a switch to control whether the model checkpoint needs to be saved after each epoch ends (#5941) 4 months ago
Edenzzzz 2069472e96 [Hotfix] Fix ZeRO typo #5936 4 months ago
Gao, Ruiyuan 5fb958cc83 [FIX BUG] convert env param to int in (#5934) 4 months ago
Insu Jang a521ffc9f8 Add n_fused as an input from native_module (#5894) 4 months ago
YeAnbang 9688e19b32 remove real data path 4 months ago
YeAnbang b0e15d563e remove real data path 4 months ago
YeAnbang 12fe8b5858 refactor evaluation 4 months ago
YeAnbang c5f582f666 fix test data 4 months ago
zhurunhua 4ec17a7cdf [FIX BUG] UnboundLocalError: cannot access local variable 'default_conversation' where it is not associated with a value (#5931) 4 months ago
YeAnbang 150505cbb8 Merge branch 'kto' of https://github.com/hpcaitech/ColossalAI into kto 4 months ago
YeAnbang d49550fb49 refactor tokenization 4 months ago
Tong Li d08c99be0d Merge branch 'main' into kto 4 months ago
Tong Li f585d4e38e [ColossalChat] Hotfix for ColossalChat (#5910) 4 months ago
Edenzzzz 8cc8f645cd [Examples] Add lazy init to OPT and GPT examples (#5924) 4 months ago
YeAnbang 544b7a38a1 fix style, add kto data sample 4 months ago
YeAnbang 845ea7214e Merge branch 'main' of https://github.com/hpcaitech/ColossalAI into kto 4 months ago
YeAnbang 09d5ffca1a add kto 4 months ago
Hongxin Liu e86127925a [plugin] support all-gather overlap for hybrid parallel (#5919) 4 months ago
Hongxin Liu 73494de577 [release] update version (#5912) 4 months ago
Hongxin Liu 27a72f0de1 [misc] support torch2.3 (#5893) 4 months ago
アマデウス 530283dba0 fix object_to_tensor usage when torch>=2.3.0 (#5820) 4 months ago
Guangyao Zhang 2e28c793ce [compatibility] support torch 2.2 (#5875) 4 months ago
YeAnbang d8bf7e09a2 Merge pull request #5901 from hpcaitech/colossalchat 4 months ago
Guangyao Zhang 1c961b20f3 [ShardFormer] fix qwen2 sp (#5903) 4 months ago
Stephan Kö 45c49dde96 [Auto Parallel]: Speed up intra-op plan generation by 44% (#5446) 4 months ago
YeAnbang b3594d4d68 fix orpo cross entropy loss 4 months ago
Hongxin Liu c068ef0fa0 [zero] support all-gather overlap (#5898) 4 months ago