Commit Graph

12 Commits (544b7a38a167cb05cdc7590cfc100e23c0ed5ab7)

Author SHA1 Message Date
pre-commit-ci[bot] 7c2f79fa98
[pre-commit.ci] pre-commit autoupdate (#5572)
* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/PyCQA/autoflake: v2.2.1 → v2.3.1](https://github.com/PyCQA/autoflake/compare/v2.2.1...v2.3.1)
- [github.com/pycqa/isort: 5.12.0 → 5.13.2](https://github.com/pycqa/isort/compare/5.12.0...5.13.2)
- [github.com/psf/black-pre-commit-mirror: 23.9.1 → 24.4.2](https://github.com/psf/black-pre-commit-mirror/compare/23.9.1...24.4.2)
- [github.com/pre-commit/mirrors-clang-format: v13.0.1 → v18.1.7](https://github.com/pre-commit/mirrors-clang-format/compare/v13.0.1...v18.1.7)
- [github.com/pre-commit/pre-commit-hooks: v4.3.0 → v4.6.0](https://github.com/pre-commit/pre-commit-hooks/compare/v4.3.0...v4.6.0)

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-07-01 17:16:41 +08:00
digger yu a799ca343b
[fix] fix typo s/muiti-node /multi-node etc. (#5448) 2024-04-07 18:42:15 +08:00
Dongruixuan Li a7ae2b5b4c
[eval-hotfix] set few_shot_data to None when few shot is disabled (#5422) 2024-03-05 21:48:55 +08:00
Camille Zhong a5756a8720
[eval] update llama npu eval (#5366) 2024-02-06 10:53:03 +08:00
Yuanchen eae01b6740
Improve logic for selecting metrics (#5196)
Co-authored-by: Xu <yuanchen.xu00@gmail.com>
2023-12-22 14:52:50 +08:00
Yuanchen 3ff60d13b0
Fix ColossalEval (#5186)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-12-15 15:06:06 +08:00
Yuanchen cefdc32615
[ColossalEval] Support GSM, Data Leakage Evaluation and Tensor Parallel (#5169)
* Support GSM, Data Leakage Evaluation and Tensor Parallel

* remove redundant code and update inference.py in examples/gpt_evaluation

---------

Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-12-12 14:47:35 +08:00
github-actions[bot] f6731db67c
[format] applied code formatting on changed files in pull request 5115 (#5118)
Co-authored-by: github-actions <github-actions@github.com>
2023-11-29 13:39:14 +08:00
Zian(Andy) Zheng 7b789f4dd2 [FEATURE] Add Safety Eval Datasets to ColossalEval (#5095)
* add safetybench and cvalues(responsibility) eval dataset

* Modify code according to review suggestions

---------

Co-authored-by: Orion-Zheng <zhengzian@u.nus.edu>
2023-11-28 11:15:04 +08:00
Yuanchen 239cd92eff
Support mtbench (#5025)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-11-09 13:41:50 +08:00
Yuanchen abe071b663
fix ColossalEval (#4992)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-10-31 10:30:03 +08:00
Yuanchen ce777853ae
[feature] ColossalEval: Evaluation Pipeline for LLMs (#4786)
* Add ColossalEval

* Delete evaluate in Chat

---------

Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
Co-authored-by: Tong Li <tong.li352711588@gmail.com>
2023-09-24 23:14:11 +08:00