ivkalgin committed
Commit 23da18b
1 Parent(s): d78be47

Create README.md (#1)

- Create README.md (ed994783d01629e1c3b167331a8093b1e759ba0b)
- fixed typos, updated model size values (ac04cb38d8d6e6aad97dec51a3a6c6d09c83daaa)

Files changed (1)
  1. README.md +41 -0
README.md ADDED
@@ -0,0 +1,41 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - lambada
+ language:
+ - en
+ library_name: transformers
+ pipeline_tag: text-generation
+ tags:
+ - text-generation-inference
+ - causal-lm
+ - int8
+ - tensorrt
+ - ENOT-AutoDL
+ ---
+
+ # INT8 GPT-J 6B
+
+ GPT-J 6B is a transformer model trained using Ben Wang's [Mesh Transformer JAX](https://github.com/kingoflolz/mesh-transformer-jax/). "GPT-J" refers to the class of model, while "6B" represents the number of trainable parameters.
+
+ This repository contains TensorRT engines with mixed precision (INT8 + FP32). Prebuilt engines are available for the following GPUs:
+ * RTX 4090
+ * RTX 3080 Ti
+ * RTX 2080 Ti
+
+ The ONNX model was generated by [ENOT-AutoDL](https://pypi.org/project/enot-autodl/) and will be published soon.
+
+ ## Test results
+
+ | |INT8|FP32|
+ |---|:---:|:---:|
+ | **LAMBADA accuracy** |78.50%|79.54%|
+ | **Model size (GB)** |8.5|24.2|
+
+
+ ## How to use
+
+ An inference example and an accuracy test are published on GitHub:
+ ```shell
+ git clone https://github.com/ENOT-AutoDL/demo-gpt-j-6B-tensorrt-int8
+ ```
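
The INT8 vs FP32 table in the README above implies a trade-off that can be worked out directly. The sketch below only does arithmetic on the four numbers reported in that table. The dictionary layout and the `compare` helper are hypothetical names introduced for illustration and are not part of the repository or of ENOT-AutoDL.

```python
# Values copied from the INT8 vs FP32 table in the README.
# The structure and key names here are illustrative assumptions.
results = {
    "INT8": {"lambada_acc": 78.50, "size_gb": 8.5},
    "FP32": {"lambada_acc": 79.54, "size_gb": 24.2},
}


def compare(quantized, baseline):
    """Summarize what quantization costs in accuracy and saves in size."""
    return {
        # Absolute accuracy loss in percentage points.
        "accuracy_drop_pp": baseline["lambada_acc"] - quantized["lambada_acc"],
        # How many times smaller the quantized engine is.
        "size_ratio": baseline["size_gb"] / quantized["size_gb"],
        # Storage saved, in GB.
        "saved_gb": baseline["size_gb"] - quantized["size_gb"],
    }


summary = compare(results["INT8"], results["FP32"])
print(f"Accuracy drop: {summary['accuracy_drop_pp']:.2f} percentage points")
print(f"Size ratio:    {summary['size_ratio']:.2f}x smaller")
print(f"Space saved:   {summary['saved_gb']:.1f} GB")
```

On the reported numbers this comes to roughly a 1.04 percentage point drop in LAMBADA accuracy in exchange for an engine about 2.85 times smaller (15.7 GB saved).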