fakezeta/llama-3-8b-instruct-ov-int8

Text Generation

Inference Endpoints

fakezeta committed on Apr 28

Commit 77bf2b8 • 1 Parent(s): 4f9f9da

Update README.md

Files changed (1)

README.md +15 -0

@@ -15,6 +15,21 @@ license_link: https://huggingface.co/meta-llama/Meta-Llama-3-8B/blob/main/LICENS
 # OpenVINO IR model with int8 quantization of llama-3-8B-Instruct
+Model definition for LocalAI:
+```
+name: llama3
+backend: transformers
+parameters:
+  model: fakezeta/llama-3-8b-instruct-ov-int8
+context_size: 8192
+type: OVModelForCausalLM
+template:
+  use_tokenizer_template: true
+stopwords:
+- "<|eot_id|>"
+- "<|end_of_text|>"
+```
 ## Model Details
 Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety.
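
Once LocalAI has loaded the model definition added in this commit, the model should be reachable under the name `llama3` through LocalAI's OpenAI-compatible chat endpoint. The following is a minimal sketch of such a request, not part of the commit itself. It assumes a LocalAI instance listening on `http://localhost:8080`; the helper names `build_chat_request` and `chat` are hypothetical and exist only for illustration.

```python
import json
import urllib.request

# Assumption: LocalAI is running locally on port 8080. Adjust as needed.
BASE_URL = "http://localhost:8080"


def build_chat_request(prompt, model="llama3", base_url=BASE_URL):
    """Build a POST request for the OpenAI-compatible chat completions route.

    `model` must match the `name:` field of the LocalAI model definition.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the prompt and return the assistant's reply text.

    Requires a reachable LocalAI server; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Say hello in one sentence."))` would print the model's reply. Because the definition sets `use_tokenizer_template: true`, the request carries plain chat messages and leaves the Llama 3 prompt formatting to the backend.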