
---
license: apache-2.0
---

# GreenBit Yi

This is GreenBitAI's pretrained 4-bit Yi-34B model. Despite the extreme compression, it retains performance close to the FP16 original, as the evaluations below show.

Please refer to our GitHub page for the code to run the model and for more information.
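For quick orientation, here is a minimal loading sketch using the Hugging Face `transformers` API. The repo id is hypothetical, and whether plain `transformers` loading works for this checkpoint (versus the runner documented on the GitHub page) is an assumption:

```python
# A minimal, hedged loading sketch. The repo id below is hypothetical; the
# authoritative entry point is the one documented on GreenBitAI's GitHub page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "GreenBitAI/yi-34b-4bit-groupsize32"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # scales/activations in FP16; weights stay packed 4-bit
    device_map="auto",
    trust_remote_code=True,      # assumed: custom quantized layers ship with the repo
)

prompt = "Q: What is the capital of France?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```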

## Model Description

### Few-Shot Evaluation (officially evaluated by 01.AI)

| Model | Yi-34B FP16 | Yi-34B 4-bit | Yi-6B FP16 | Yi-6B 4-bit |
|---|---|---|---|---|
| GroupSize | - | 32 | - | 32 |
| Model Size (GB) | 68.79 | 19.89 | 12.12 | 4.04 |
| **AVG** | **70.64** | **69.7** | **60.11** | **59.14** |
| **Detailed Evaluation** | | | | |
| MMLU | 76.32 | 75.42 | 63.24 | 62.09 |
| CMMLU | 83.65 | 83.07 | 75.53 | 72.85 |
| ARC-e | 84.42 | 84.13 | 77.23 | 76.52 |
| ARC-c | 61.77 | 59.56 | 50.34 | 48.47 |
| GAOKAO | 82.8 | 81.37 | 72.2 | 72.87 |
| GSM8K | 67.24 | 63.61 | 32.52 | 28.05 |
| HumanEval | 25.6 | 25 | 15.85 | 15.85 |
| BBH | 54.3 | 52.3 | 42.8 | 41.47 |
| WinoGrande | 78.68 | 78.53 | 70.63 | 71.19 |
| PIQA | 82.86 | 82.75 | 78.56 | 79.05 |
| SIQA | 74.46 | 73.44 | 64.53 | 64.53 |
| HellaSwag | 83.64 | 83.02 | 74.91 | 73.27 |
| OBQA | 91.6 | 90.8 | 85.4 | 82.6 |
| CSQA | 83.37 | 83.05 | 76.9 | 75.43 |
| TriviaQA | 81.52 | 80.73 | 64.85 | 61.75 |
| SQuAD | 92.46 | 91.12 | 88.95 | 88.39 |
| BoolQ | 88.25 | 88.17 | 76.23 | 77.1 |
| MBPP | 41 | 39.68 | 26.32 | 25.13 |
| QuAC | 48.61 | 47.43 | 40.92 | 40.16 |
| LAMBADA | 73.18 | 73.39 | 67.74 | 67.8 |
| NaturalQuestions | 27.67 | 27.21 | 16.69 | 17.42 |
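
For context, "GroupSize 32" means each contiguous group of 32 weights shares one quantization scale and zero-point; 4-bit weights plus per-group metadata works out to roughly 4.6 effective bits per weight, consistent with the table (19.89 / 68.79 × 16 ≈ 4.6). Below is a minimal NumPy sketch of an asymmetric group-wise 4-bit scheme; it illustrates the general technique, not GreenBitAI's exact algorithm:

```python
# Illustrative group-wise 4-bit quantization; GreenBitAI's actual method may differ.
import numpy as np

def quantize_groupwise_4bit(w, group_size=32):
    """Asymmetric 4-bit quantization: one scale/zero-point per group of weights."""
    g = w.reshape(-1, group_size)
    w_min = g.min(axis=1, keepdims=True)
    w_max = g.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0              # 4 bits -> 16 levels (0..15)
    zero = np.round(-w_min / scale)             # integer zero-point per group
    q = np.clip(np.round(g / scale) + zero, 0, 15).astype(np.uint8)
    return q, scale.astype(np.float16), zero

def dequantize(q, scale, zero):
    return (q.astype(np.float32) - zero) * scale.astype(np.float32)

w = np.random.randn(4096 * 32).astype(np.float32)
q, scale, zero = quantize_groupwise_4bit(w)
err = np.abs(dequantize(q, scale, zero) - w.reshape(-1, 32)).mean()
print(f"mean |error| after 4-bit round trip: {err:.4f}")
```

Smaller groups track the local weight distribution more closely (lower error) at the cost of more scale/zero-point metadata; group size 32 sits at the fine-grained end of that trade-off.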

### Zero-Shot Evaluation

| Task | Metric | Yi-6B FP16 | Yi-6B 4-bit | Yi-34B 4-bit |
|---|---|---|---|---|
| openbookqa | acc | 0.314 | 0.324 | 0.344 |
| | acc_norm | 0.408 | 0.42 | 0.474 |
| arc_challenge | acc | 0.462 | 0.4573 | 0.569 |
| | acc_norm | 0.504 | 0.483 | 0.5964 |
| hellaswag | acc | 0.553 | 0.5447 | 0.628 |
| | acc_norm | 0.749 | 0.7327 | 0.83 |
| piqa | acc | 0.777 | 0.7709 | 0.8079 |
| | acc_norm | 0.787 | 0.7894 | 0.828 |
| arc_easy | acc | 0.777 | 0.7697 | 0.835 |
| | acc_norm | 0.774 | 0.7659 | 0.84 |
| winogrande | acc | 0.707 | 0.7095 | 0.7853 |
| boolq | acc | 0.755 | 0.7648 | 0.886 |
| truthfulqa_mc | mc1 | 0.29 | 0.2729 | 0.4026 |
| | mc2 | 0.419 | 0.4033 | 0.5528 |
| anli_r1 | acc | 0.423 | 0.416 | 0.554 |
| anli_r2 | acc | 0.409 | 0.409 | 0.518 |
| anli_r3 | acc | 0.411 | 0.393 | 0.4983 |
| wic | acc | 0.529 | 0.545 | 0.5376 |
| rte | acc | 0.685 | 0.7039 | 0.7617 |
| record | f1 | 0.904 | 0.9011 | 0.924 |
| | em | 0.8962 | 0.8927 | 0.916 |
| **Average** | | **0.596** | **0.5937** | **0.6708** |
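
The task and metric names above follow the conventions of EleutherAI's lm-evaluation-harness. Assuming that tool (v0.4 API) was used, which this card does not confirm, a sketch of reproducing a few of the numbers would look like this; the repo id is again hypothetical:

```python
# Hedged reproduction sketch with lm-evaluation-harness; whether this exact
# tool and version produced the table above is an assumption.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=GreenBitAI/yi-34b-4bit-groupsize32,trust_remote_code=True",
    tasks=["openbookqa", "arc_challenge", "hellaswag", "piqa", "boolq"],
    num_fewshot=0,    # zero-shot, matching the table
    batch_size=8,
)

for task, metrics in results["results"].items():
    print(task, metrics)
```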