Hongxin Liu
|
19e1a5cf16
|
[shardformer] update colo attention to support custom mask (#5510)
* [feature] refactor colo attention (#5462)
* [extension] update api
* [feature] add colo attention
* [feature] update sdpa
* [feature] update npu attention
* [feature] update flash-attn
* [test] add flash attn test
* [test] update flash attn test
* [shardformer] update modeling to fit colo attention (#5465)
* [misc] refactor folder structure
* [shardformer] update llama flash-attn
* [shardformer] fix llama policy
* [devops] update tensornvme install
* [test] update llama test
* [shardformer] update colo attn kernel dispatch
* [shardformer] update blip2
* [shardformer] update chatglm
* [shardformer] update gpt2
* [shardformer] update gptj
* [shardformer] update opt
* [shardformer] update vit
* [shardformer] update colo attention mask prep
* [shardformer] update whisper
* [test] fix shardformer tests (#5514)
* [test] fix shardformer tests
* [test] fix shardformer tests
|
2024-03-27 11:19:32 +08:00 |
Frank Lee
|
7cfed5f076
|
[feat] refactored extension module (#5298)
* [feat] refactored extension module
* polish
* polish
* polish
* polish
* polish
* polish
* polish
* polish
* polish
* polish
|
2024-01-25 17:01:48 +08:00 |
Xuanlei Zhao
|
dd2c28a323
|
[npu] use extension for op builder (#5172)
* update extension
* update cpu adam
* update is
* add doc for cpu adam
* update kernel
* update commit
* update flash
* update memory efficient
* update flash attn
* update flash attention loader
* update api
* fix
* update doc
* update example time limit
* reverse change
* fix doc
* remove useless kernel
* fix
* not use warning
* update
* update
|
2024-01-08 11:39:16 +08:00 |
flybird11111
|
aae496631c
|
[shardformer]fix flash attention, when mask is casual, just don't unpad it (#5084)
* fix flash attn
* fix
fix
|
2023-11-22 16:00:07 +08:00 |
Hongxin Liu
|
079bf3cb26
|
[misc] update pre-commit and run all files (#4752)
* [misc] update pre-commit
* [misc] run pre-commit
* [misc] remove useless configuration files
* [misc] ignore cuda for clang-format
|
2023-09-19 14:20:26 +08:00 |
flybird11111
|
59e252ecdb
|
[shardformer] chatglm support sequence parallel (#4482)
* [shardformer] chatglm support sequence parallel
[shardformer] chatglm support sequence parallel
[shardformer] chatglm support sequence parallel
[shardformer] chatglm support sequence parallel
[shardformer] chatglm support sequence parallel
[shardformer] chatglm support sequence parallel
* fix
fix
fix
fix
|
2023-08-22 23:59:31 +08:00 |
Jianghai
|
5545114fd8
|
rename chatglm to chatglm2 (#4484)
|
2023-08-22 14:13:31 +08:00 |