Elsa Granger
b2ad0d9e8f
[pipeline,shardformer] Fix p2p efficiency in pipeline, allow skipping loading weight not in weight_map when `strict=False`, fix llama flash attention forward, add flop estimation by megatron in llama benchmark (#5017)
* Use p2p
* Cannot use bidirectional send in p2p
* Refactor tensor creation and serialization in P2P communication
* Fix llama forward args in flash attention
* Add flop estimate from megatron
* Support loading weight not in weight_map when strict=False in hybrid_parallel
* Use send_forward_recv_backward, etc in 1f1b
* Use dataclass for metadata
  Remove torch.cuda.synchronize() as suggested
* Add comment about the torch.cuda.synchronize for potential error
* Typo
* Update hybrid_parallel_checkpoint_io.py
* Update p2p.py
* Update one_f_one_b.py
* Update p2p.py
---------
Co-authored-by: flybird11111 <1829166702@qq.com>
1 year ago
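The commit message above mentions moving the pipeline P2P code to a dataclass for tensor metadata. Below is a minimal sketch of that general pattern, assuming the usual split between a small picklable metadata object and the raw tensor payload; the names `TensorMetadata`, `make_metadata`, and `allocate_buffer` are hypothetical and not the repository's actual API:

```python
# Hypothetical sketch of dataclass-based tensor metadata for pipeline P2P.
# Only this small object is pickled; the tensor payload is sent separately.
from dataclasses import dataclass

import torch


@dataclass
class TensorMetadata:
    shape: tuple          # tensor shape, e.g. (batch, seq_len, hidden)
    dtype: torch.dtype    # element type of the payload tensor
    requires_grad: bool   # whether the receiver should track gradients


def make_metadata(t: torch.Tensor) -> TensorMetadata:
    """Describe a tensor so only this lightweight object needs to be pickled."""
    return TensorMetadata(tuple(t.shape), t.dtype, t.requires_grad)


def allocate_buffer(meta: TensorMetadata, device: torch.device) -> torch.Tensor:
    """Pre-allocate a receive buffer matching the sender's tensor."""
    buf = torch.empty(meta.shape, dtype=meta.dtype, device=device)
    buf.requires_grad_(meta.requires_grad)  # pipeline activations are floating point
    return buf
```

In practice the metadata would be exchanged first (e.g. via an object-based send/recv), and the payload would then be received directly into the pre-allocated buffer with a plain tensor send/recv, so the full tensor never goes through pickling.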
| Name | Last commit | Last updated |
| --- | --- | --- |
| _C | [setup] support pre-build and jit-build of cuda kernels (#2374) | 2 years ago |
| _analyzer | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| amp | [feature] Add clip_grad_norm for hybrid_parallel_plugin (#4837) | 1 year ago |
| auto_parallel | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| autochunk | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| booster | [gemini] gemini support tensor parallelism. (#4942) | 1 year ago |
| checkpoint_io | [pipeline,shardformer] Fix p2p efficiency in pipeline, allow skipping loading weight not in weight_map when `strict=False`, fix llama flash attention forward, add flop estimation by megatron in llama benchmark (#5017) | 1 year ago |
| cli | [bug] Fix the version check bug in colossalai run when generating the cmd. (#4713) | 1 year ago |
| cluster | [gemini] gemini support tensor parallelism. (#4942) | 1 year ago |
| context | [moe] merge moe into main (#4978) | 1 year ago |
| device | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| fx | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| inference | [Kernels]Update triton kernels into 2.1.0 (#5046) | 1 year ago |
| interface | [lazy] support from_pretrained (#4801) | 1 year ago |
| kernel | [Kernels]Update triton kernels into 2.1.0 (#5046) | 1 year ago |
| lazy | [doc] add lazy init docs (#4808) | 1 year ago |
| legacy | [moe] merge moe into main (#4978) | 1 year ago |
| logging | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| moe | [moe]: fix ep/tp tests, add hierarchical all2all (#4982) | 1 year ago |
| nn | [moe] merge moe into main (#4978) | 1 year ago |
| pipeline | [pipeline,shardformer] Fix p2p efficiency in pipeline, allow skipping loading weight not in weight_map when `strict=False`, fix llama flash attention forward, add flop estimation by megatron in llama benchmark (#5017) | 1 year ago |
| shardformer | [pipeline,shardformer] Fix p2p efficiency in pipeline, allow skipping loading weight not in weight_map when `strict=False`, fix llama flash attention forward, add flop estimation by megatron in llama benchmark (#5017) | 1 year ago |
| tensor | [gemini] gemini support tensor parallelism. (#4942) | 1 year ago |
| testing | [test] merge old components to test to model zoo (#4945) | 1 year ago |
| utils | [moe] merge moe into main (#4978) | 1 year ago |
| zero | [gemini] gemini support tensor parallelism. (#4942) | 1 year ago |
| __init__.py | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| initialize.py | [misc] update pre-commit and run all files (#4752) | 1 year ago |