Spaces: Somunia/cpu-casuallm
Somunia committed on Sep 3

Commit dc779ad • 1 Parent(s): 8bc6b74

Update app.py

Files changed (1)

app.py CHANGED

@@ -30,9 +30,6 @@ model = AutoModelForCausalLM.from_pretrained(
     # use_flash_attention_2=False
 ).to(torch.float32)
-model = model.quantize(8)  # Quantize to int8 (experiment with different values)
-model = model.to("cpu")
 # Create a custom tokenizer (make sure to download vocab.json)
 tokenizer = AutoTokenizer.from_pretrained(
     model_path,
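
The removed lines attempted int8 quantization on CPU. For reference, one possible way to get int8 weights on CPU is PyTorch's dynamic quantization. This is a swapped-in technique, not what the commit used, and it assumes the model's linear layers are standard `torch.nn.Linear` modules. The helper name `quantize_linear_layers_int8` and the toy model are hypothetical stand-ins for the real model loaded in `app.py`.

```python
import torch
import torch.nn as nn


def quantize_linear_layers_int8(model: nn.Module) -> nn.Module:
    """Return a copy of `model` whose nn.Linear layers use int8 weights.

    Hypothetical helper: dynamic quantization runs on CPU only and
    quantizes activations on the fly at inference time.
    """
    model = model.to("cpu").eval()
    # quantize_dynamic returns a new module by default (inplace=False)
    return torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )


# Toy stand-in for the causal LM (assumption: the real model is not loaded here)
toy_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
quantized = quantize_linear_layers_int8(toy_model)

with torch.no_grad():
    out = quantized(torch.randn(2, 16))
```

Whether this helps for a given model depends on how much of its compute sits in `nn.Linear` layers, so it is worth measuring latency and output quality before and after.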