ColossalAI/colossalai/inference/modeling/layers
yuehuayingxueluo 35382a7fbf
[Inference] Fused the gate and up proj in mlp, and optimized the autograd process. (#5365)
* Fuse the gate and up proj in the MLP

* Fix code style

* Optimize autograd

* Roll back test_inference_engine.py

* Apply modifications based on review feedback

* Fix bugs in flash attn

* Change reshape to view

* Fix test_rmsnorm_triton.py
2024-02-06 19:38:25 +08:00
__init__.py [doc] updated inference readme (#5343) 2024-02-02 14:31:10 +08:00
attention.py [Inference] Fused the gate and up proj in mlp, and optimized the autograd process. (#5365) 2024-02-06 19:38:25 +08:00