Dongruixuan Li
a7ae2b5b4c
[eval-hotfix] set few_shot_data to None when few shot is disabled (#5422)
9 months ago
Camille Zhong
a5756a8720
[eval] update llama npu eval (#5366)
10 months ago
digger yu
756c400ad2
fix typo in applications/ColossalEval/README.md (#5250)
11 months ago
Tong Li
d992b55968
[Colossal-LLaMA-2] Release Colossal-LLaMA-2-13b-base model (#5224)
* update readme
* update readme
* update link
* update
* update readme
* update
* update
* update
* update title
* update example
* update example
* fix content
* add conclusion
* add license
* update
* update
* update version
* fix minor
11 months ago
Yuanchen
eae01b6740
Improve logic for selecting metrics (#5196)
Co-authored-by: Xu <yuanchen.xu00@gmail.com>
11 months ago
Yuanchen
3ff60d13b0
Fix ColossalEval (#5186)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
12 months ago
Yuanchen
cefdc32615
[ColossalEval] Support GSM, Data Leakage Evaluation and Tensor Parallel (#5169)
* Support GSM, Data Leakage Evaluation and Tensor Parallel
* remove redundant code and update inference.py in examples/gpt_evaluation
---------
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
12 months ago
github-actions[bot]
f6731db67c
[format] applied code formatting on changed files in pull request 5115 (#5118)
Co-authored-by: github-actions <github-actions@github.com>
1 year ago
digger yu
9110406a47
fix typo change JOSNL TO JSONL etc. (#5116)
1 year ago
Zian(Andy) Zheng
7b789f4dd2
[FEATURE] Add Safety Eval Datasets to ColossalEval (#5095)
* add safetybench and cvalues(responsibility) eval dataset
* Modify code according to review suggestions
---------
Co-authored-by: Orion-Zheng <zhengzian@u.nus.edu>
1 year ago
Yuanchen
239cd92eff
Support mtbench (#5025)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
1 year ago
Yuanchen
abe071b663
fix ColossalEval (#4992)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
1 year ago
Yuanchen
1fa8c5e09f
Update Qwen-7B results (#4821)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
1 year ago
Yuanchen
ce777853ae
[feature] ColossalEval: Evaluation Pipeline for LLMs (#4786)
* Add ColossalEval
* Delete evaluate in Chat
---------
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
Co-authored-by: Tong Li <tong.li352711588@gmail.com>
1 year ago