keyfan committed on
Commit 628abf2
1 Parent(s): 8eead03

Update README.md

Files changed (1)
  1. README.md +6 -2
README.md CHANGED
@@ -20,5 +20,9 @@ Measured at Wikitext with 4096 context length
  | 5.8438 | 6.9492 |
 
  ## Speed
- Measured with `examples/benchmark_latency.py` script at vLLM repo.
- At batch size = 1, it generates at 13.5 tokens/s with single A100.
+
+ Latency and throughput are measured using vLLM (`examples/benchmark_latency.py` and `examples/benchmark_throughput.py` respectively) at single A100-80G.
+
+ Latency at batch size 1: 13.5 tokens/s.
+
+ Throughput: 0.77 requests/s
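
To put the two figures in the updated README on a common footing, the rates can be inverted into time intervals. The following is a minimal sketch, not part of the commit: the helper name `seconds_per_unit` is hypothetical, and the only inputs are the 13.5 tokens/s and 0.77 requests/s values reported above. Note that the inverse of the throughput is the average time between completed requests under batched load, which is not the same thing as the latency of a single request.

```python
def seconds_per_unit(rate_per_second: float) -> float:
    """Invert a rate (units per second) into seconds per unit.

    Hypothetical helper for illustration only; it is not part of vLLM.
    """
    if rate_per_second <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / rate_per_second


# Figures reported in the README (single A100-80G).
TOKENS_PER_SECOND_BATCH_1 = 13.5   # latency benchmark, batch size 1
REQUESTS_PER_SECOND = 0.77         # throughput benchmark

# Time to generate one token at batch size 1, in milliseconds.
ms_per_token = seconds_per_unit(TOKENS_PER_SECOND_BATCH_1) * 1000.0

# Average time between completed requests under batched load, in seconds.
s_per_request = seconds_per_unit(REQUESTS_PER_SECOND)

print(f"{ms_per_token:.1f} ms per token at batch size 1")
print(f"{s_per_request:.2f} s between completed requests on average")
```

At batch size 1 this works out to roughly 74 ms per generated token, and the throughput figure corresponds to about one completed request every 1.3 seconds.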