Making large AI models cheaper, faster and more accessible
GuangyaoZhang 457a0de79f shardformer fp8 4 months ago
Name                 Updated       Last commit
moe                  5 months ago  [MoE/ZeRO] Moe refactor with zero refactor (#5821)
__init__.py          5 months ago  Remove CohereLayerNorm and use existing layernorm
_operation.py        4 months ago  shardformer fp8
attn.py              5 months ago  [Inference] Fix flash-attn import and add model test (#5794)
dropout.py           1 year ago    [misc] update pre-commit and run all files (#4752)
embedding.py         7 months ago  [Inference] Fix bugs and docs for feat/online-server (#5598)
linear.py            4 months ago  shardformer fp8
loss.py              6 months ago  [Shardformer] Add parallel output for shardformer models(bloom, falcon) (#5702)
normalization.py     5 months ago  Remove CohereLayerNorm and use existing layernorm
parallel_module.py   7 months ago  [shardformer] refactor embedding resize (#5603)
qkv_fused_linear.py  4 months ago  shardformer fp8
utils.py             8 months ago  [shardformer] Sequence Parallelism Optimization (#5533)