Yuanchen
eae01b6740
Improve logic for selecting metrics (#5196)
Co-authored-by: Xu <yuanchen.xu00@gmail.com>
2023-12-22 14:52:50 +08:00
Yuanchen
3ff60d13b0
Fix ColossalEval (#5186)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-12-15 15:06:06 +08:00
Yuanchen
cefdc32615
[ColossalEval] Support GSM, Data Leakage Evaluation and Tensor Parallel (#5169)
* Support GSM, Data Leakage Evaluation and Tensor Parallel
* remove redundant code and update inference.py in examples/gpt_evaluation
---------
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-12-12 14:47:35 +08:00
Zian(Andy) Zheng
7b789f4dd2
[FEATURE] Add Safety Eval Datasets to ColossalEval (#5095)
* add safetybench and cvalues(responsibility) eval dataset
* Modify code according to review suggestions
---------
Co-authored-by: Orion-Zheng <zhengzian@u.nus.edu>
2023-11-28 11:15:04 +08:00
Yuanchen
239cd92eff
Support mtbench (#5025)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-11-09 13:41:50 +08:00
Yuanchen
abe071b663
fix ColossalEval (#4992)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-10-31 10:30:03 +08:00
Yuanchen
ce777853ae
[feature] ColossalEval: Evaluation Pipeline for LLMs (#4786)
* Add ColossalEval
* Delete evaluate in Chat
---------
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
Co-authored-by: Tong Li <tong.li352711588@gmail.com>
2023-09-24 23:14:11 +08:00