Update README.md
README.md
CHANGED
@@ -86,7 +86,7 @@ and is comparable with Mistral-7B-Instruct-v0.1 on MMLU and MT-Bench in English.
 
 ## Inference Performance
 In this test, we use the first 700 characters of the [web article](https://health.udn.com/health/story/5976/7699252?from=udn_ch1005_main_index) as the input and ask the model to write the same article again.
-All
+All inferences run on 2 RTX A6000 GPUs (using `vllm`, with a tensor-parallel size of 2).
 
 | Models | Inference Time (sec)|Estimated Max Input Length (Char)|
 |--------------------------------------------------------------------|-------------------|--------------------------|
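
The setup described in the changed line (first 700 characters as input, `vllm` with a tensor-parallel size of 2) could be reproduced roughly as follows. This is a minimal sketch, not the authors' benchmark script: the helper names `build_prompt` and `time_inference`, the prompt wording, the greedy sampling settings, and the `max_tokens` value are all assumptions, and the model identifier must be supplied by the caller.

```python
import time


def build_prompt(article: str, max_chars: int = 700) -> str:
    """Keep the first `max_chars` characters of the article and ask for a rewrite."""
    excerpt = article[:max_chars]
    # The instruction wording is an assumption; the README does not give the exact prompt.
    return f"Write the following article again:\n\n{excerpt}"


def time_inference(model_name: str, prompt: str, max_tokens: int = 1024) -> float:
    """Return wall-clock seconds for one generation on 2 GPUs via tensor parallelism."""
    # Imported lazily so build_prompt can be used on machines without vllm installed.
    from vllm import LLM, SamplingParams

    # tensor_parallel_size=2 shards the model weights across two GPUs.
    llm = LLM(model=model_name, tensor_parallel_size=2)
    # Greedy decoding and the token cap are assumed settings, not taken from the README.
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)

    # Model loading happens above, so only generation is timed.
    start = time.perf_counter()
    llm.generate([prompt], params)
    return time.perf_counter() - start
```

Calling `time_inference("<model-id>", build_prompt(article_text))` on a machine with two GPUs would give a number comparable in spirit to the "Inference Time (sec)" column, though absolute values depend on hardware, `vllm` version, and generation length.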