Update README.md
README.md CHANGED

@@ -4,12 +4,15 @@ language:
 - ja
 ---
 
+Because extra Japanese and Chinese data were added during quantization, this model has been shown to achieve better perplexity, measured on Japanese data, than [hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4](hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4).
+
+
 ```
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer, AwqConfig
 
 model_id = "dahara1/llama3.1-8b-Instruct-awq"
-
+
 
 quantization_config = AwqConfig(
     bits=4,