How to call the model from Python

by hflin0613 - opened May 6

May 6

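I tried running the following program both on my local machine and on Colab (T4 GPU), but neither produces a result: it keeps running indefinitely without raising any error or exception. Is something wrong in the code?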

import torch
from huggingface_hub import login
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

login(
    token="HF_TOKEN",
    add_to_git_credential=True
)

model_id = "taide/Llama3-TAIDE-LX-8B-Chat-Alpha1"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)



@torch.inference_mode()
def get_completion(prompt: str) -> str:
    response = pipe(
        prompt,
        max_new_tokens=256,
        do_sample=True, 
        temperature=0.1, 
        top_k=50, 
        top_p=0.1, 
        eos_token_id=pipe.tokenizer.eos_token_id, 
        pad_token_id=pipe.tokenizer.pad_token_id
    )
    return response

response = get_completion("你好")
print(response)

nctu6

TAIDE org May 6

edited May 6

Hello,

Please refer to:
[1] https://huggingface.co/docs/hub/security-tokens
[2] https://huggingface.co/settings/tokens

In the Access Tokens tab of your settings [2], you will find your own token (a string of characters, not the literal "HF_TOKEN"). If you do not have one, create a new one.

Assuming the token is hf_Cmxxxxxyyyyzzzzztaidetaidetaidetaide, pass the correct token to login() as follows.

my_token = "hf_Cmxxxxxyyyyzzzzztaidetaidetaidetaide"
login(
    token=my_token,
    add_to_git_credential=True
)
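As a variation on the snippet above, the token can be read from an environment variable instead of being hard-coded in the script. This is only a minimal sketch: the helper names resolve_token and login_from_env, and the choice of HF_TOKEN as the variable name, are assumptions made for illustration and are not part of the original reply.

```python
import os

# The literal string used in the original question; it is a placeholder, not a token.
PLACEHOLDER = "HF_TOKEN"


def resolve_token(env_var: str = "HF_TOKEN") -> str:
    """Return the access token stored in the given environment variable.

    Raises ValueError if the variable is unset, empty, or still the placeholder.
    """
    token = os.environ.get(env_var, "").strip()
    if not token or token == PLACEHOLDER:
        raise ValueError(f"Set {env_var} to a real access token first.")
    return token


def login_from_env(env_var: str = "HF_TOKEN") -> None:
    """Log in to the Hugging Face Hub with the token found in env_var."""
    # Imported lazily so resolve_token() works without huggingface_hub installed.
    from huggingface_hub import login

    login(token=resolve_token(env_var), add_to_git_credential=True)
```

Calling login_from_env() before loading the model fails early with a clear message if the placeholder string was left in place, rather than passing it on to login().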

Best regards.

hflin0613

May 7

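Thanks for the reply!

The token I am using is indeed of the form hf_xxxxx, and login and model loading both succeed when I run the script.
The problem is that get_completion() keeps running, like an infinite loop, without reporting any error.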

nctu6

TAIDE org May 7

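Hello,

I tested the code, changing only the token, and it printed the result normally.

You can refer to the following section and try turning on debug logging: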
https://huggingface.co/docs/transformers/main_classes/logging
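One possible way to turn on debug logging, sketched here under the assumption that a recent transformers release is installed. The wrapper name enable_debug_logging is hypothetical; set_verbosity_debug() and the TRANSFORMERS_VERBOSITY environment variable are the ones described on the page linked above.

```python
import os


def enable_debug_logging() -> bool:
    """Turn on DEBUG-level logging for transformers.

    Returns True if transformers could be imported and its verbosity was set,
    False if only the environment variable could be set.
    """
    # Documented environment variable; it also applies to processes that
    # import transformers after this point.
    os.environ["TRANSFORMERS_VERBOSITY"] = "debug"
    try:
        from transformers.utils import logging as hf_logging
    except ImportError:
        return False
    # Raise the verbosity of the library's root logger for the current process.
    hf_logging.set_verbosity_debug()
    return True
```

Calling enable_debug_logging() at the top of the script, before from_pretrained(), should make the loading steps visible in the output, which may help show where the time is being spent.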

Best regards.

hflin0613

May 7

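Thanks again for the reply!

My current guess is that the cause is the hardware; after switching to the 4-bit quantized model, it runs normally.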
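A rough back-of-the-envelope check is consistent with the hardware explanation. With about 8 billion parameters (an approximation taken from the "8B" in the model name), the weights alone need roughly 30 GiB in 32-bit precision, roughly 15 GiB in 16-bit, and under 4 GiB at 4 bits, while a T4 has 16 GB of memory. If the weights were loaded in 32-bit precision, which is likely because the original snippet passes no torch_dtype, device_map="auto" may have offloaded part of the model to CPU or disk, and generation would then be very slow rather than failing. The sketch below shows the estimate and one possible way to load with 4-bit quantization through bitsandbytes. The thread does not say which 4-bit model or method was actually used, so the helper names and quantization settings here are assumptions.

```python
def estimate_weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone (no activations, no KV cache)."""
    return num_params * bits_per_param / 8 / 1024**3


def load_in_4bit(model_id: str):
    """Load a causal LM with bitsandbytes 4-bit quantization (needs a CUDA GPU)."""
    # Lazy imports: these need torch, transformers, accelerate and bitsandbytes.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        # float16 compute, chosen because the T4 (Turing) predates hardware bfloat16.
        bnb_4bit_compute_dtype=torch.float16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        quantization_config=quant_config,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return model, tokenizer


if __name__ == "__main__":
    # 8e9 is an approximation of the parameter count, not an exact figure.
    for bits in (32, 16, 4):
        print(f"{bits}-bit weights: ~{estimate_weight_memory_gib(8e9, bits):.1f} GiB")
```

The model and tokenizer returned by load_in_4bit("taide/Llama3-TAIDE-LX-8B-Chat-Alpha1") can be passed to pipeline() in the same way as in the original post.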

aqweteddy changed discussion status to closed May 8
