MediaTek-Research/Breeze-7B-Instruct-v0_1


cllatMTK committed on Jan 11

Commit 65792a4 • 1 Parent(s): 275cb85

Update README.md

Files changed (1)

README.md +1 -1


@@ -86,7 +86,7 @@ and is comparable with Mistral-7B-Instruct-v0.1 on MMLU and MT-Bench in English.
 ## Inference Performance
 In this test, we use the first 700 characters of the [web article](https://health.udn.com/health/story/5976/7699252?from=udn_ch1005_main_index) as the input and ask the model to write the same article again.
-All models were inferenced with `vllm` on 2 NVIDIA RTX A6000  (TP=2).
+All inferences run on 2 RTX A6000 GPUs (using `vllm`, with a tensor-parallel size of 2).
 | Models                                                             | Inference Time (sec)|Estimated Max Input Length (Char)|
 |--------------------------------------------------------------------|-------------------|--------------------------|
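
For readers who want to approximate the setup the changed line describes, here is a minimal sketch, not the authors' benchmark script. It assumes `vllm` is installed and two GPUs are visible. The prompt wording, the `max_tokens` value, greedy decoding, and the helper names `build_prompt` and `timed_generate` are all hypothetical; the README only states that the first 700 characters of the article are used as input, that inference runs through `vllm`, and that the tensor-parallel size is 2.

```python
import time

# Repository name as shown on this page.
MODEL_ID = "MediaTek-Research/Breeze-7B-Instruct-v0_1"


def build_prompt(article: str, max_chars: int = 700) -> str:
    """Keep the first `max_chars` characters and ask for a rewrite.

    The instruction wording is a placeholder (assumption); the README
    does not give the exact prompt used in the test.
    """
    excerpt = article[:max_chars]
    return f"Please write the following article again:\n\n{excerpt}"


def timed_generate(prompt: str, max_tokens: int = 1024) -> tuple[str, float]:
    """Generate with vLLM sharded across 2 GPUs; return (text, seconds)."""
    # Imported lazily so build_prompt works without vLLM installed.
    from vllm import LLM, SamplingParams

    # tensor_parallel_size=2 matches the "TP=2" setting in the diff.
    llm = LLM(model=MODEL_ID, tensor_parallel_size=2)
    # Greedy decoding and max_tokens are assumptions, not from the README.
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)

    # Time only the generation call, not model loading.
    start = time.perf_counter()
    outputs = llm.generate([prompt], params)
    elapsed = time.perf_counter() - start
    return outputs[0].outputs[0].text, elapsed
```

Calling `timed_generate(build_prompt(article_text))` would give one timing sample; the numbers will depend on hardware, `vllm` version, and the decoding settings chosen here, so they need not match the table.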