Commit Graph

54 Commits (184a65370451452bae87d0058bba06563028c4a8)

Author SHA1 Message Date
Season 7ef91606e1
[Fix]: implement thread-safety singleton to avoid deadlock for very large-scale training scenarios (#5625)
7 months ago
Xuanlei Zhao dc003c304c
[moe] merge moe into main (#4978)
1 year ago
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
1 year ago
Hongxin Liu b5f9e37c70
[legacy] clean up legacy code (#4743)
1 year ago
Hongxin Liu ac178ca5c1 [legacy] move builder and registry to legacy (#4603)
1 year ago
digger-yu b7141c36dd
[CI] fix some spelling errors (#3707)
2 years ago
digger-yu b9a8dff7e5
[doc] Fix typo under colossalai and doc(#3618)
2 years ago
yuxuan-lou 198a74b9fd
[NFC] polish colossalai/context/random/__init__.py code style (#3327)
2 years ago
RichardoLuo 1ce9d0c531 [NFC] polish initializer_data.py code style (#3287)
2 years ago
Kai Wang (Victor Kai) 964a28678f [NFC] polish initializer_3d.py code style (#3279)
2 years ago
Arsmart1 8af977f223 [NFC] polish colossalai/context/parallel_context.py code style (#3276)
2 years ago
Zirui Zhu c9e3ee389e
[NFC] polish colossalai/context/process_group_initializer/initializer_2d.py code style (#2726)
2 years ago
Ziyue Jiang 4603538ddd
[NFC] posh colossalai/context/process_group_initializer/initializer_sequence.py code style (#2712)
2 years ago
アマデウス 534f68c83c
[NFC] polish pipeline process group code style (#2694)
2 years ago
LuGY 56ff1921e9
[NFC] polish colossalai/context/moe_context.py code style (#2693)
2 years ago
アマデウス 99d9713b02 Revert "Update parallel_context.py (#2408)"
2 years ago
Haofan Wang 7d5640b9db
Update parallel_context.py (#2408)
2 years ago
Tongping Liu 8e22c38b89
[hotfix] Fixing the bug related to ipv6 support
2 years ago
kurisusnowdeng 0b8161fab8 updated tp layers
2 years ago
HELSON 1468e4bcfc
[zero] add constant placement policy (#1705)
2 years ago
HELSON 95c35f73bd
[moe] initialize MoE groups by ProcessGroup (#1640)
2 years ago
Frank Lee 27fe8af60c
[autoparallel] refactored shape consistency to remove redundancy (#1591)
2 years ago
ver217 d068af81a3
[doc] update rst and docstring (#1351)
2 years ago
Frank Lee 2238758c2e
[usability] improved error messages in the context module (#856)
3 years ago
Frank Lee 920fe31526
[compatibility] used backward-compatible API for global process group (#758)
3 years ago
Frank Lee 04ff5ea546
[utils] support detection of number of processes on current node (#723)
3 years ago
Cautiousss 055d0270c8 [NFC] polish colossalai/context/process_group_initializer/initializer_sequence.py colossalai/context/process_group_initializer initializer_tensor.py code style (#639)
3 years ago
Jiang Zhuo 0a96338b13 [NFC] polish <colossalai/context/process_group_initializer/initializer_data.py> code stype (#626)
3 years ago
ziyu huang 701bad439b [NFC] polish colossalai/context/process_group_initializer/process_group_initializer.py code stype (#617)
3 years ago
アマデウス 297b8baae2
[model checkpoint] add gloo groups for cpu tensor communication (#589)
3 years ago
Liang Bowen 2c45efc398
html refactor (#555)
3 years ago
Liang Bowen ec5086c49c Refactored docstring to google style
3 years ago
Jiarui Fang a445e118cf
[polish] polish singleton and global context (#500)
3 years ago
HELSON f24b5ed201
[MOE] remove old MoE legacy (#493)
3 years ago
Jiarui Fang 65c0f380c2
[format] polish name format for MOE (#481)
3 years ago
HELSON 7544347145
[MOE] add unitest for MOE experts layout, gradient handler and kernel (#469)
3 years ago
HELSON 84fd7c1d4d
add moe context, moe utilities and refactor gradient handler (#455)
3 years ago
Frank Lee b72b8445c6
optimized context test time consumption (#446)
3 years ago
Frank Lee 1e4bf85cdb fixed bug in activation checkpointing test (#387)
3 years ago
RichardoLuo 8539898ec6 flake8 style change (#363)
3 years ago
ziyu huang a77d73f22b fix format parallel_context.py (#359)
3 years ago
Maruyama_Aya e83970e3dc fix format ColossalAI\colossalai\context\process_group_initializer
3 years ago
アマデウス 9ee197d0e9 moved env variables to global variables; (#215)
3 years ago
HELSON 0f8c7f9804
Fixed docstring in colossalai (#171)
3 years ago
Frank Lee e2089c5c15
adapted for sequence parallel (#163)
3 years ago
HELSON dceae85195
Added MoE parallel (#127)
3 years ago
ver217 a951bc6089
update default logger (#100) (#101)
3 years ago
ver217 96780e6ee4
Optimize pipeline schedule (#94)
3 years ago
アマデウス 01a80cd86d
Hotfix/Colossalai layers (#92)
3 years ago
アマデウス 0fedef4f3c
Layer integration (#83)
3 years ago