Commit Graph

5 Commits (19d1510ea26d10484a804eb62f6d03dbcc7c80a8)

Author       SHA1        Message                                                       Date
Li Xingjian  8554585a5f  [Inference] Fix flash-attn import and add model test (#5794)  6 months ago
char-1ee     f5981e808e  Remove flash attention backend                                6 months ago
char-1ee     5f398fc000  Pass inference model shard configs for module init            6 months ago
char-1ee     eec77e5702  Fix tests and naming                                          6 months ago
char-1ee     04386d9eff  Refactor modeling by adding attention backend                 6 months ago