xinhe committed on
Commit
e1c59c8
1 Parent(s): 7125487

Update README.md

Files changed (1):
  1. README.md +0 -8
README.md CHANGED
@@ -38,12 +38,8 @@ The linear modules **bert.encoder.layer.2.output.dense, bert.encoder.layer.5.int
 
 ### Test result
 
-- Batch size = 8
-- [Amazon Web Services](https://aws.amazon.com/) c6i.xlarge (Intel ICE Lake: 4 vCPUs, 8g Memory) instance.
-
 | |INT8|FP32|
 |---|:---:|:---:|
-| **Throughput (samples/sec)** |16.55|9.333|
 | **Accuracy (eval-accuracy)** |0.7838|0.7915|
 | **Model size (MB)** |133|418|
 
@@ -55,7 +51,3 @@ int8_model = OptimizedModel.from_pretrained(
     'Intel/bert-base-uncased-finetuned-swag-int8-static',
 )
 ```
-
-Notes:
-- The INT8 model has better performance than the FP32 model when the CPU is fully occupied. Otherwise, there will be the illusion that INT8 is inferior to FP32.
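The table kept by this diff still reports the model size dropping from 418 MB (FP32) to 133 MB (INT8). As a quick sanity check on those numbers, the ratio works out to roughly 3.1x; the sketch below is plain arithmetic on the table's values, not a measured result:

```python
# Model sizes taken from the README table above (MB)
fp32_size_mb = 418
int8_size_mb = 133

# INT8 static quantization stores weights in 8 bits instead of 32,
# so ~4x is the theoretical ceiling for the weight tensors alone;
# embeddings, metadata, and any layers left in FP32 keep the real
# ratio below that.
ratio = fp32_size_mb / int8_size_mb
print(f"Compression ratio: {ratio:.2f}x")
```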