163 Commits (a8d459f99a1d415fc843327e4dafce19ecee1f3e)

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Yuanheng Zhao | 12e7c28d5e | [hotfix] fix OpenMOE example import path (#5697) | 7 months ago |
| Yuanheng Zhao | 55cc7f3df7 | [Fix] Fix Inference Example, Tests, and Requirements (#5688) | 7 months ago |
| Hongxin Liu | 7f8b16635b | [misc] refactor launch API and tensor constructor (#5666) | 7 months ago |
| Tong Li | 68ec99e946 | [hotfix] add soft link to support required files (#5661) | 7 months ago |
| Hongxin Liu | 1b387ca9fe | [shardformer] refactor pipeline grad ckpt config (#5646) | 7 months ago |
| 傅剑寒 | 279300dc5f | [Inference/Refactor] Refactor compilation mechanism and unified multi hw (#5613) | 7 months ago |
| binmakeswell | f4c5aafe29 | [example] llama3 (#5631) | 7 months ago |
| Hongxin Liu | 4de4e31818 | [exampe] update llama example (#5626) | 7 months ago |
| Edenzzzz | d83c633ca6 | [hotfix] Fix examples no pad token & auto parallel codegen bug; (#5606) | 7 months ago |
| Hongxin Liu | 641b1ee71a | [devops] remove post commit ci (#5566) | 8 months ago |
| digger yu | 341263df48 | [hotfix] fix typo s/get_defualt_parser /get_default_parser (#5548) | 8 months ago |
| digger yu | a799ca343b | [fix] fix typo s/muiti-node /multi-node etc. (#5448) | 8 months ago |
| Wenhao Chen | e614aa34f3 | [shardformer, pipeline] add `gradient_checkpointing_ratio` and heterogenous shard policy for llama (#5508) | 8 months ago |
| Yuanheng Zhao | 36c4bb2893 | [Fix] Grok-1 use tokenizer from the same pretrained path (#5532) | 8 months ago |
| Insu Jang | 00525f7772 | [shardformer] fix pipeline forward error if custom layer distribution is used (#5189) | 8 months ago |
| Yuanheng Zhao | 131f32a076 | [fix] fix grok-1 example typo (#5506) | 8 months ago |
| binmakeswell | 34e909256c | [release] grok-1 inference benchmark (#5500) | 8 months ago |
| Wenhao Chen | bb0a668fee | [hotfix] set return_outputs=False in examples and polish code (#5404) | 8 months ago |
| Yuanheng Zhao | 5fcd7795cd | [example] update Grok-1 inference (#5495) | 8 months ago |
| binmakeswell | 6df844b8c4 | [release] grok-1 314b inference (#5490) | 8 months ago |
| Hongxin Liu | 848a574c26 | [example] add grok-1 inference (#5485) | 8 months ago |
| Luo Yihang | e239cf9060 | [hotfix] fix typo of openmoe model source (#5403) | 9 months ago |
| Hongxin Liu | 070df689e6 | [devops] fix extention building (#5427) | 9 months ago |
| flybird11111 | 29695cf70c | [example]add gpt2 benchmark example script. (#5295) | 9 months ago |
| Hongxin Liu | d882d18c65 | [example] reuse flash attn patch (#5400) | 9 months ago |
| digger yu | 71321a07cf | fix typo change dosen't to doesn't (#5308) | 10 months ago |
| flybird11111 | f7e3f82a7e | fix llama pretrain (#5287) | 10 months ago |
| Wenhao Chen | ef4f0ee854 | [hotfix]: add pp sanity check and fix mbs arg (#5268) | 10 months ago |
| binmakeswell | c174c4fc5f | [doc] fix doc typo (#5256) | 11 months ago |
| Hongxin Liu | d202cc28c0 | [npu] change device to accelerator api (#5239) | 11 months ago |
| Xuanlei Zhao | dd2c28a323 | [npu] use extension for op builder (#5172) | 11 months ago |
| Wenhao Chen | 3c0d82b19b | [pipeline]: support arbitrary batch size in forward_only mode (#5201) | 11 months ago |
| Wenhao Chen | 4fa689fca1 | [pipeline]: fix p2p comm, add metadata cache and support llama interleaved pp (#5134) | 11 months ago |
| flybird11111 | 21aa5de00b | [gemini] hotfix NaN loss while using Gemini + tensor_parallel (#5150) | 12 months ago |
| binmakeswell | 177c79f2d1 | [doc] add moe news (#5128) | 1 year ago |
| Wenhao Chen | 7172459e74 | [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088) | 1 year ago |
| digger yu | d5661f0f25 | [nfc] fix typo change directoty to directory (#5111) | 1 year ago |
| Xuanlei Zhao | 3acbf6d496 | [npu] add npu support for hybrid plugin and llama (#5090) | 1 year ago |
| flybird11111 | aae496631c | [shardformer]fix flash attention, when mask is casual, just don't unpad it (#5084) | 1 year ago |
| github-actions[bot] | 8921a73c90 | [format] applied code formatting on changed files in pull request 5067 (#5072) | 1 year ago |
| Hongxin Liu | e5ce4c8ea6 | [npu] add npu support for gemini and zero (#5067) | 1 year ago |
| flybird11111 | bc09b95f50 | [exampe] fix llama example' loss error when using gemini plugin (#5060) | 1 year ago |
| Elsa Granger | b2ad0d9e8f | [pipeline,shardformer] Fix p2p efficiency in pipeline, allow skipping loading weight not in weight_map when `strict=False`, fix llama flash attention forward, add flop estimation by megatron in llama benchmark (#5017) | 1 year ago |
| Wenhao Chen | 724441279b | [moe]: fix ep/tp tests, add hierarchical all2all (#4982) | 1 year ago |
| Xuanlei Zhao | f71e63b0f3 | [moe] support optimizer checkpoint (#5015) | 1 year ago |
| Xuanlei Zhao | dc003c304c | [moe] merge moe into main (#4978) | 1 year ago |
| Blagoy Simandoff | 8aed02b957 | [nfc] fix minor typo in README (#4846) | 1 year ago |
| Baizhou Zhang | df66741f77 | [bug] fix get_default_parser in examples (#4764) | 1 year ago |
| Wenhao Chen | 7b9b86441f | [chat]: update rm, add wandb and fix bugs (#4471) | 1 year ago |
| Hongxin Liu | 079bf3cb26 | [misc] update pre-commit and run all files (#4752) | 1 year ago |