Tong Li
1d96a562bb
update
2024-01-11 14:05:44 +08:00
Tong Li
dac240563c
minor update
2024-01-10 11:12:09 +08:00
Tong Li
ea088b5f75
update train code
2024-01-10 10:42:37 +08:00
Tong Li
4b7f273022
add moe
2024-01-09 11:59:38 +08:00
ver217
63ee6fffe6
Merge branch 'main' into exp/mixtral
2024-01-08 16:43:54 +08:00
ver217
ce1cff26bd
Merge branch 'main' into exp/mixtral
2024-01-08 16:42:00 +08:00
Elsa Granger
d565df3821
[pipeline] A more general _communicate in p2p ( #5062 )
...
* A more general _communicate
* feat: finish tree_flatten version p2p
* fix: update p2p api calls
---------
Co-authored-by: Wenhao Chen <cwher@outlook.com>
2024-01-08 15:37:27 +08:00
binmakeswell
7bc6969ce6
[doc] SwiftInfer release ( #5236 )
...
* [doc] SwiftInfer release
* [doc] SwiftInfer release
* [doc] SwiftInfer release
* [doc] SwiftInfer release
* [doc] SwiftInfer release
2024-01-08 09:55:12 +08:00
github-actions[bot]
4fb4a22a72
[format] applied code formatting on changed files in pull request 5234 ( #5235 )
...
Co-authored-by: github-actions <github-actions@github.com>
2024-01-07 20:55:34 +08:00
binmakeswell
b9b32b15e6
[doc] add Colossal-LLaMA-2-13B ( #5234 )
...
* [doc] add Colossal-LLaMA-2-13B
* [doc] add Colossal-LLaMA-2-13B
* [doc] add Colossal-LLaMA-2-13B
2024-01-07 20:53:12 +08:00
JIMMY ZHAO
ce651270f1
[doc] Make leaderboard format more uniform and good-looking ( #5231 )
...
* Make leaderboard format more unified and good-looking
* Update README.md
* Update README.md
2024-01-06 17:12:29 +08:00
Camille Zhong
915b4652f3
[doc] Update README.md of Colossal-LLAMA2 ( #5233 )
...
* Update README.md
* Update README.md
2024-01-06 17:06:41 +08:00
Tong Li
d992b55968
[Colossal-LLaMA-2] Release Colossal-LLaMA-2-13b-base model ( #5224 )
...
* update readme
* update readme
* update link
* update
* update readme
* update
* update
* update
* update title
* update example
* update example
* fix content
* add conclusion
* add license
* update
* update
* update version
* fix minor
2024-01-05 17:24:26 +08:00
Wenhao Chen
196b85368b
[pipeline]: add p2p fallback order and fix interleaved pp deadlock ( #5214 )
...
* fix: add fallback order option and update 1f1b
* fix: fix deadlock comm in interleaved pp
* test: modify p2p test
2024-01-05 14:01:54 +08:00
Wenhao Chen
931d0e0731
[pipeline]: support arbitrary batch size in forward_only mode ( #5201 )
...
* fix: remove drop last in val & test dataloader
* feat: add run_forward_only, support arbitrary bs
* chore: modify ci script
2024-01-05 14:01:39 +08:00
Wenhao Chen
1810b9100f
[pipeline]: fix p2p comm, add metadata cache and support llama interleaved pp ( #5134 )
...
* test: add more p2p tests
* fix: remove send_forward_recv_forward as p2p op list needs to use the same group
* fix: make send and receive atomic
* feat: update P2PComm fn
* feat: add metadata cache in 1f1b
* feat: add metadata cache in interleaved pp
* feat: modify is_xx_stage fn
* revert: add _broadcast_object_list
* feat: add interleaved pp in llama policy
* feat: set NCCL_BUFFSIZE in HybridParallelPlugin
2024-01-05 13:58:53 +08:00
digger yu
b0b53a171c
[nfc] fix typo colossalai/shardformer/ ( #5133 )
2024-01-04 16:21:55 +08:00
Xuanlei Zhao
6b69f3085b
update
2024-01-03 15:37:59 +08:00
flybird11111
451e9142b8
fix flash attn ( #5209 )
2024-01-03 14:39:53 +08:00
flybird11111
365671be10
fix-test ( #5210 )
...
fix-test
fix-test
2024-01-03 14:26:13 +08:00
Xuanlei Zhao
8ca8cf8ec3
update optim
2024-01-03 11:57:23 +08:00
Hongxin Liu
7f3400b560
[devops] update torch version in ci ( #5217 )
2024-01-03 11:46:33 +08:00
Wenhao Chen
d799a3088f
[pipeline]: add p2p fallback order and fix interleaved pp deadlock ( #5214 )
...
* fix: add fallback order option and update 1f1b
* fix: fix deadlock comm in interleaved pp
* test: modify p2p test
2024-01-03 11:34:49 +08:00
Wenhao Chen
3c0d82b19b
[pipeline]: support arbitrary batch size in forward_only mode ( #5201 )
...
* fix: remove drop last in val & test dataloader
* feat: add run_forward_only, support arbitrary bs
* chore: modify ci script
2024-01-02 23:41:12 +08:00
Xuanlei Zhao
f037583bd2
update train
2024-01-02 14:01:58 +08:00
flybird11111
02d2328a04
support linear accumulation fusion ( #5199 )
...
support linear accumulation fusion
support linear accumulation fusion
fix
2023-12-29 18:22:42 +08:00
Xuanlei Zhao
0b8c33f474
update
2023-12-29 18:20:32 +08:00
Xuanlei Zhao
c1c6af6368
update
2023-12-29 18:09:28 +08:00
Xuanlei Zhao
0bb317d9e6
update
2023-12-29 17:28:46 +08:00
Xuanlei Zhao
ccad7014c6
update optim
2023-12-29 16:51:29 +08:00
Xuanlei Zhao
44014faa67
fix optim
2023-12-28 21:58:08 +08:00
Xuanlei Zhao
0a3aae509b
update utils and fwd bwd
2023-12-28 18:54:56 +08:00
Xuanlei Zhao
a5580e6289
update test
2023-12-28 18:52:37 +08:00
Xuanlei Zhao
73aa406b96
update
2023-12-28 15:48:04 +08:00
Zhongkai Zhao
64519eb830
[doc] Update required third-party library list for testing and torch compatibility checking ( #5207 )
...
* doc/update requirements-test.txt
* update torch-cuda compatibility check
2023-12-27 18:03:45 +08:00
Xuanlei Zhao
570f5cd693
update pytest
2023-12-27 16:05:00 +08:00
Xuanlei Zhao
54b197cc02
update readme
2023-12-26 17:39:38 +08:00
Xuanlei Zhao
4922641098
script
2023-12-26 17:33:32 +08:00
Xuanlei Zhao
d660a41850
update
2023-12-26 17:32:59 +08:00
Xuanlei Zhao
b8fadb68a7
add pad
2023-12-25 17:02:05 +08:00
Xuanlei Zhao
23341687ed
update
2023-12-25 16:29:47 +08:00
Xuanlei Zhao
aa2e091dc6
update
2023-12-25 16:05:42 +08:00
Yuanchen
eae01b6740
Improve logic for selecting metrics ( #5196 )
...
Co-authored-by: Xu <yuanchen.xu00@gmail.com>
2023-12-22 14:52:50 +08:00
Wenhao Chen
4fa689fca1
[pipeline]: fix p2p comm, add metadata cache and support llama interleaved pp ( #5134 )
...
* test: add more p2p tests
* fix: remove send_forward_recv_forward as p2p op list needs to use the same group
* fix: make send and receive atomic
* feat: update P2PComm fn
* feat: add metadata cache in 1f1b
* feat: add metadata cache in interleaved pp
* feat: modify is_xx_stage fn
* revert: add _broadcast_object_list
* feat: add interleaved pp in llama policy
* feat: set NCCL_BUFFSIZE in HybridParallelPlugin
2023-12-22 10:44:00 +08:00
BlueRum
af952673f7
polish readme in application/chat ( #5194 )
2023-12-20 11:28:39 +08:00
Xuanlei Zhao
7c5b1a585f
update
2023-12-18 10:37:07 +08:00
flybird11111
681d9b12ef
[doc] update pytorch version in documents. ( #5177 )
...
* fix
aaa
fix
fix
fix
* fix
* fix
* test ci
* fix ci
fix
* update pytorch version in documents
2023-12-15 18:16:48 +08:00
Xuanlei Zhao
ebd8cc579a
update script
2023-12-15 16:38:51 +08:00
Xuanlei Zhao
f66469e209
update
2023-12-15 16:32:32 +08:00
Yuanchen
3ff60d13b0
Fix ColossalEval ( #5186 )
...
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-12-15 15:06:06 +08:00