## Model introduction
- Base model: t5-3b
- Training data: wmt16 ro-en (610,320 sentence pairs in total; 2,000 of them were used)
- Distributed training tool: deepspeed (a minimal config sketch follows this list)
- GPU: a single RTX 4090 with 24 GB of memory
- Goal: fine-tune the model for Romanian-to-English (ro-en) translation
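The card does not publish the exact training script, so the following is only a minimal sketch of how t5-3b could be fine-tuned on a single 24 GB GPU via the Hugging Face Trainer's DeepSpeed integration. The `ds_config` values and every hyperparameter shown here are assumptions for illustration, not the settings actually used for this checkpoint.

```python
# Minimal sketch (assumed settings, not the exact recipe used for this model):
# ZeRO stage 3 with CPU offload is one common way to fit a 3B-parameter
# seq2seq model into 24 GB of GPU memory.
from transformers import Seq2SeqTrainingArguments

ds_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu"},
        "offload_param": {"device": "cpu"},
    },
    "bf16": {"enabled": "auto"},              # filled in from the Trainer arguments
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-3b-wmt16-ro-en",
    per_device_train_batch_size=1,    # placeholder values, not the ones used here
    gradient_accumulation_steps=8,
    learning_rate=1e-4,
    num_train_epochs=1,
    bf16=True,
    predict_with_generate=True,
    deepspeed=ds_config,              # enables the DeepSpeed integration in Trainer
)
# These arguments would then be passed to a Seq2SeqTrainer together with the
# tokenized wmt16 ro-en subset, and the script launched with the `deepspeed`
# (or `accelerate`) launcher.
```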
## Usage
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("snowfly/t5-3b-wmt16-ro-en")
tokenizer = AutoTokenizer.from_pretrained("snowfly/t5-3b-wmt16-ro-en")

# T5 translation checkpoints are usually prompted with a task prefix;
# "translate Romanian to English: " is the standard prefix for wmt16 ro-en.
input_ids = tokenizer(
    "translate Romanian to English: Acesta este un test.",
    return_tensors="pt",
).input_ids
gen_output = model.generate(input_ids)[0]
print(tokenizer.decode(gen_output, skip_special_tokens=True))
```