Update README.md
- neural-compressor
---

# INT8 GPT-J 6B

## Model Description

GPT-J 6B is a transformer model trained using Ben Wang's [Mesh Transformer JAX](https://github.com/kingoflolz/mesh-transformer-jax/). "GPT-J" refers to the class of model, while "6B" represents the number of trainable parameters.

This int8 PyTorch model is generated by [intel-extension-for-transformers](https://github.com/intel/intel-extension-for-transformers).

| Package | Version |
|----------------------------------|------------|
| intel-extension-for-transformers | a4aba8ddb07c9b744b6ac106502ec059e0c47960 |
| neural-compressor | 2.4.1 |
| torch | 2.1.0+cpu |
| intel-extension-for-pytorch | 2.1.0 |
| transformers | 4.32.0 |
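Int8 quantization maps floating-point weights to 8-bit integers through a scale and zero point, then maps them back at compute time. The sketch below illustrates that generic affine quantization math only; it is not the intel-extension-for-transformers or neural-compressor API, and the sample weights are made up for illustration.

```python
# Illustrative affine (scale, zero-point) int8 quantization round trip.
# Generic quantization math only -- not the intel-extension-for-transformers API.

def quantize_int8(values):
    """Quantize a list of floats to int8 with an affine (scale, zero-point) mapping."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 or 1.0  # guard against a constant tensor
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Map int8 values back to floats."""
    return [(x - zero_point) * scale for x in q]

weights = [-0.51, -0.02, 0.0, 0.27, 0.49]  # hypothetical sample weights
q, scale, zp = quantize_int8(weights)
restored = dequantize_int8(q, scale, zp)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
# The round trip loses at most half a quantization step (scale / 2),
# which is why well-calibrated int8 models can track fp32 accuracy closely.
```
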

## Evaluation results

Accuracy of the optimized gpt-j-6b model, evaluated with lm_eval on the lambada_openai dataset:

| Dtype | Dataset | Accuracy |
|-------|----------------|----------|
| FP32 | lambada_openai | 0.6831 |
| INT8 | lambada_openai | 0.6835 |
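Using the values from the table above, the int8 model effectively matches (and here marginally exceeds) the fp32 baseline; a quick check:

```python
# Accuracy values copied from the evaluation table above (lambada_openai, lm_eval).
fp32_acc = 0.6831
int8_acc = 0.6835

# Relative accuracy of the int8 model versus the fp32 baseline.
retention = int8_acc / fp32_acc
print(f"int8 relative accuracy: {retention:.4f}")  # ~1.0006, i.e. no accuracy loss
```
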