flybird11111
79718fae04
[shardformer] llama support DistCrossEntropy (#5176) (12 months ago)

Squashed commit message:

* llama support dist-cross (with follow-up fixes)
* test ci / fix ci
* [Colossal-Llama-2] Add finetuning Colossal-Llama-2 example (#4878)
  * Add finetuning Colossal-Llama-2 example and support NEFTuning
  * Add inference example and refine neftune
  * Modify readme file
  * update the imports

Co-authored-by: Yuanchen <70520919+chengeharrison@users.noreply.github.com>
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
Co-authored-by: Camille Zhong <44392324+Camille7777@users.noreply.github.com>
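The latest commit adds DistCrossEntropy support for Llama in ShardFormer, i.e. computing the cross-entropy loss when the logits are sharded along the vocabulary dimension under tensor parallelism. The sketch below is a minimal illustration of that general idea, not the ColossalAI implementation: the function name `dist_cross_entropy`, its arguments, and the process-group handling are assumptions for illustration only.

```python
# Illustrative sketch (NOT the ColossalAI implementation): cross-entropy over
# vocabulary-parallel logits, where each rank holds a contiguous slice of the
# vocab dimension and the full logits are never gathered.
import torch
import torch.distributed as dist


def dist_cross_entropy(vocab_parallel_logits: torch.Tensor,
                       target: torch.Tensor,
                       vocab_start: int,
                       vocab_end: int,
                       group=None) -> torch.Tensor:
    """Forward computation only.

    vocab_parallel_logits: (N, V_local) slice of the logits held by this rank.
    target: (N,) global token ids.
    [vocab_start, vocab_end) is the global vocab range owned by this rank.
    """
    # 1) Subtract the global max for numerical stability (MAX all-reduce).
    logits_max = vocab_parallel_logits.max(dim=-1).values
    dist.all_reduce(logits_max, op=dist.ReduceOp.MAX, group=group)
    logits = vocab_parallel_logits - logits_max.unsqueeze(-1)

    # 2) Pick out the target logit; ranks that do not own a target contribute 0.
    mask = (target < vocab_start) | (target >= vocab_end)
    local_target = (target - vocab_start).clamp(0, vocab_end - vocab_start - 1)
    target_logits = logits.gather(1, local_target.unsqueeze(-1)).squeeze(-1)
    target_logits = target_logits.masked_fill(mask, 0.0)
    dist.all_reduce(target_logits, op=dist.ReduceOp.SUM, group=group)

    # 3) Global log-sum-exp via a SUM all-reduce over the local exp sums.
    sum_exp = logits.exp().sum(dim=-1)
    dist.all_reduce(sum_exp, op=dist.ReduceOp.SUM, group=group)

    # 4) Per-token loss: log(sum_exp) - (target_logit - max), then mean.
    return (sum_exp.log() - target_logits).mean()
```

In a real training setup this forward pass would be wrapped in a custom autograd Function so that gradients flow back into each rank's logits shard; the sketch shows only how the loss value is assembled from per-rank partial results.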
Name | Last commit message | Last updated
_C | … |
_analyzer | [misc] update pre-commit and run all files (#4752) | 1 year ago
amp | [npu] add npu support for gemini and zero (#5067) | 1 year ago
auto_parallel | [npu] add npu support for gemini and zero (#5067) | 1 year ago
autochunk | [misc] update pre-commit and run all files (#4752) | 1 year ago
booster | [gemini] hotfix NaN loss while using Gemini + tensor_parallel (#5150) | 1 year ago
checkpoint_io | [pipeline,shardformer] Fix p2p efficiency in pipeline, allow skipping loading weight not in weight_map when `strict=False`, fix llama flash attention forward, add flop estimation by megatron in llama benchmark (#5017) | 1 year ago
cli | [bug] Fix the version check bug in colossalai run when generating the cmd. (#4713) | 1 year ago
cluster | [gemini] gemini support tensor parallelism. (#4942) | 1 year ago
context | [moe] merge moe into main (#4978) | 1 year ago
device | [npu] add npu support for hybrid plugin and llama (#5090) | 1 year ago
fx | [misc] update pre-commit and run all files (#4752) | 1 year ago
inference | [Hotfix] Fix model policy matching strategy in ShardFormer (#5064) | 1 year ago
interface | [lazy] support from_pretrained (#4801) | 1 year ago
kernel | fix thrust-transform-reduce error (#5078) | 1 year ago
lazy | [doc] add lazy init docs (#4808) | 1 year ago
legacy | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 1 year ago
logging | [misc] update pre-commit and run all files (#4752) | 1 year ago
moe | [hotfix]: modify create_ep_hierarchical_group and add test (#5032) | 1 year ago
nn | [npu] add npu support for gemini and zero (#5067) | 1 year ago
pipeline | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 1 year ago
shardformer | [shardformer] llama support DistCrossEntropy (#5176) | 12 months ago
tensor | fix (#5158) | 1 year ago
testing | [npu] add npu support for hybrid plugin and llama (#5090) | 1 year ago
utils | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 1 year ago
zero | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 1 year ago
__init__.py | [misc] update pre-commit and run all files (#4752) | 1 year ago
initialize.py | [npu] add npu support for gemini and zero (#5067) | 1 year ago