ColossalAI/colossalai/shardformer/layer
Latest commit: 3e2b6132b7 by hxwang, "[moe] clean legacy code" (4 months ago)
| File | Last commit | Age |
| --- | --- | --- |
| __init__.py | [Feature] Enable PP + SP for llama (#5868) | 5 months ago |
| _operation.py | [ShardFormer] Add Ulysses Sequence Parallelism support for Command-R, Qwen2 and ChatGLM (#5897) | 5 months ago |
| attn.py | [shardformer] hotfix attn mask (#5947) | 4 months ago |
| dropout.py | [misc] update pre-commit and run all files (#4752) | 1 year ago |
| embedding.py | [Inference] Fix bugs and docs for feat/online-server (#5598) | 7 months ago |
| linear.py | [shardformer] refactor embedding resize (#5603) | 7 months ago |
| loss.py | [Feature] Enable PP + SP for llama (#5868) | 5 months ago |
| normalization.py | Remove CohereLayerNorm and use existing layernorm | 5 months ago |
| parallel_module.py | [shardformer] refactor embedding resize (#5603) | 7 months ago |
| qkv_fused_linear.py | Add n_fused as an input from native_module (#5894) | 4 months ago |
| utils.py | [shardformer] Sequence Parallelism Optimization (#5533) | 8 months ago |
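These modules are the tensor-parallel building blocks that shardformer swaps in for their `torch.nn` counterparts, following the `from_native_module` convention hinted at by parallel_module.py and the qkv_fused_linear.py commit message. The sketch below illustrates that pattern with `Linear1D_Col` from linear.py; it is a minimal sketch, not verified against this revision, and the keyword arguments `process_group` and `gather_output` are assumptions about the constructor's signature.

```python
# Minimal sketch of the from_native_module pattern used by layers in this
# directory. Assumes torch.distributed is initialized (e.g. via
# colossalai.launch) and that Linear1D_Col accepts process_group and
# gather_output; both keyword names are assumptions, not checked here.
import torch.nn as nn
import torch.distributed as dist

from colossalai.shardformer.layer import Linear1D_Col

native = nn.Linear(1024, 4096)

# Shard the 4096 output features column-wise across the default group,
# then all-gather so downstream modules see the full output dimension.
sharded = Linear1D_Col.from_native_module(
    native,
    process_group=dist.group.WORLD,
    gather_output=True,
)
```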