ColossalAI/colossalai/shardformer/layer
Latest commit: 02d2328a04 by flybird11111, support linear accumulation fusion (#5199), 11 months ago
| File | Last commit | Age |
| --- | --- | --- |
| __init__.py | [hotfix] Add layer norm gradients all-reduce for sequence parallel (#4926) | 1 year ago |
| _operation.py | support linear accumulation fusion (#5199) | 11 months ago |
| dropout.py | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| embedding.py | [gemini] gemini support tensor parallelism. (#4942) | 1 year ago |
| linear.py | support linear accumulation fusion (#5199) | 11 months ago |
| loss.py | [shardformer] llama support DistCrossEntropy (#5176) | 12 months ago |
| normalization.py | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 1 year ago |
| parallel_module.py | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| qkv_fused_linear.py | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| utils.py | [npu] add npu support for hybrid plugin and llama (#5090) | 1 year ago |