* add fused precision support for norm
* refactor code
* change the granularity of hook
* fix bugs when self.model is a ModuleList
* add dtype condition for post hook
* refactor code for split group
* refactor code for pre/post hook
* refactor code for split group
* remove fp32 hook for norm
* unit tests for fused precision
* add doc for fused precision
* add doc for English version
* reformat docs
* Update mixed_precision.rst
* Update mixed_precision.po
* update mixed_precision.po
* add long text generation in doc/usage.md
---------
Co-authored-by: YWMditto <862779238@qq.com>
* feat: add numa
* feat: add bind numa
* feat: bind numa
* feat: add numa
* try_bind_numa should not raise exception
---------
Co-authored-by: 877825076@qq.com <877825076@qq.com>
* fix(storage): fix try_get_storage_backend
* fix typo and print info only on the log rank
---------
Co-authored-by: gaoyang07 <Gary1546308416AL@gmail.com>
* feat(.github/workflows/e2e_test.yaml): update e2e yaml
* test e2e
* fix(ci): test ci
* fix(ci): add weekly tests
---------
Co-authored-by: huangting4201 <huangting3@sensetime.com>
* tests(test_training): add test case for loss accuracy
* tests(test_training): update test cases
* ci(.github/workflows/e2e_test.yaml): remove pull submodule
* ci(.github/workflows/e2e_test.yaml): update ci env and remove useless env var
* test(tests/test_training): add 16 GPUs test cases
* test(tests/test_training): fix training_16GPU_8DP2PP test case error
* test(tests/test_training): add new case for interleaved pp
* test(tests/test_training): remove redundant code
* test(tests/test_training): update ci job timeout minutes to 30m
* feat(initialize/launch.py): check num_chunks and interleaved_overlap
---------
Co-authored-by: huangting4201 <huangting3@sensetime.com>
* fix(chat): fix stream_chat to return generator (#123)
* fix(configs/7B_sft.py): model dtype float16 to bfloat16 (#302)
* fix(convert2hf.py): fix the rotary_emb.inv_freq KeyError (#299)
* support openai api to deploy internlm
* update README with information on openai_api.py
* change example in README_EN.md to English
* delete unnecessary print; fix model card typo; fix chat epoch
---------
Co-authored-by: yingtongxiong <974106207@qq.com>
Co-authored-by: zhjunqin <zhjunqin@users.noreply.github.com>
Co-authored-by: huangting4201 <1538303371@qq.com>
Co-authored-by: jiangtann <39088437+jiangtann@users.noreply.github.com>
* add training image for docs
* docs(doc/code-docs): add training img for en doc
* docs(doc/code-docs): fix en docs for initialize
* docs(doc/code-docs): update conf file for readthedocs
* docs(doc/code-docs): fix typos
* docs(doc/code-docs): fix typos for readthedocs
* docs(doc/code-docs): minor typo fix for readthedocs
* docs(doc/code-docs): fix readthedocs conf file
* docs(doc/code-docs): update training image
* docs(doc/code-docs): fix typos
* docs(doc/code-docs): update training image
* docs(doc/code-docs): move training image to section initialize
* docs(doc/code-docs): fix lint
* add badge for readthedocs status
* fix: broadcast should not be in the communication stream
* feat: support allreduce grad using async op
* fix bug of async op
* use ReduceOp.AVG
* use torch flat
* delete unused stream
* feat: overlap allreduce with memcpy
---------
Co-authored-by: yingtongxiong <974106207@qq.com>
* feat(code-docs): test auto doc
* docs(doc/code-docs): add zh_CN structure
* docs(doc/code-docs): test install.md
* docs(doc/code-docs): source file to zh
* docs(doc/code-docs): update source files
* docs(doc/code-docs): add locales en
* docs(doc/code-docs): add locales en install
* docs(doc/code-docs): add locales en example
* docs(doc/code-docs): update en checkpoint
* add en translation for parallel.rst docs
* add en translation for profiler.po docs
* docs(doc/code-docs): update en monitor
* add en translation for monitor, qa, training docs
* add en translation for quickstart docs
* docs(doc/code-docs): update monitor.po and usage.po
* docs(doc/code-docs): fix typos
* docs(doc/code-docs): update en parallel
* docs(doc/code-docs): update en usage
* docs(doc/code-docs): update en profiler
* docs(doc/code-docs): update en initialize
---------
Co-authored-by: zigzagcai <caizheng@pjlab.org.cn>