
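Model description

  • Model: t5-3b
  • Data: wmt16 ro-en (610,320 examples in total, of which 2,000 were used)
  • Distributed training tool: DeepSpeed
  • GPU: one RTX 4090 (24 GB)
  • Goal: train the model's translation ability on ro-en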
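
The setup above pairs a roughly 3B-parameter model with a single 24 GB card, which is likely why a tool such as DeepSpeed is involved. The back-of-the-envelope sketch below illustrates the memory pressure. It assumes full fine-tuning in FP32 with an Adam-style optimizer (neither is stated on this card), uses the 2.95B parameter count listed on the model page, and ignores activations and buffers, so treat the numbers as a rough lower bound rather than a measurement.

```python
# Rough memory estimate for full fine-tuning.
# Assumptions (not stated on the model card): FP32 weights and gradients,
# Adam-style optimizer with two FP32 state tensors per parameter.
PARAMS = 2.95e9        # parameter count listed on the model page
GPU_MEMORY_GB = 24     # one RTX 4090

BYTES_PER_PARAM = {
    "weights (fp32)": 4,
    "gradients (fp32)": 4,
    "Adam first moment": 4,
    "Adam second moment": 4,
}


def estimate_gb(params, bytes_per_param):
    """Return the memory needed for each component, in GB (1 GB = 1e9 bytes)."""
    return {name: params * b / 1e9 for name, b in bytes_per_param.items()}


breakdown = estimate_gb(PARAMS, BYTES_PER_PARAM)
total_gb = sum(breakdown.values())
fits_on_gpu = total_gb <= GPU_MEMORY_GB

for name, gb in breakdown.items():
    print(f"{name:>20}: {gb:5.1f} GB")
print(f"{'total':>20}: {total_gb:5.1f} GB (GPU has {GPU_MEMORY_GB} GB)")
print("fits on GPU without offloading:", fits_on_gpu)
```

Under these assumptions the weights alone take about half of the card's memory, and the full training state does not fit, so some form of partitioning or CPU offloading would be needed.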

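Usage

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hub.
model = AutoModelForSeq2SeqLM.from_pretrained("snowfly/t5-3b-wmt16-ro-en")
tokenizer = AutoTokenizer.from_pretrained("snowfly/t5-3b-wmt16-ro-en")

# T5 expects a task prefix. The prefix and direction below are an assumption;
# use the one the model was fine-tuned with.
text = "translate Romanian to English: Unde este gara?"
input_ids = tokenizer(text, return_tensors="pt").input_ids
gen_output = model.generate(input_ids, max_new_tokens=64)[0]

print(tokenizer.decode(gen_output, skip_special_tokens=True))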
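
For more than one sentence, a small helper can batch the inputs and pass explicit generation settings. The sketch below is illustrative only: `translate_batch` is a hypothetical helper, and the default prefix, `max_new_tokens`, and `num_beams` values are assumptions rather than settings documented for this model. It relies only on the standard tokenizer call, `generate`, and `batch_decode` methods from transformers.

```python
def translate_batch(sentences, model, tokenizer,
                    prefix="translate Romanian to English: ",
                    max_new_tokens=128, num_beams=4):
    """Translate a list of sentences in one padded batch.

    The prefix and generation settings are assumed defaults; adjust them to
    match how the checkpoint was fine-tuned.
    """
    # Prepend the task prefix that T5-style models use to select the task.
    prompts = [prefix + s for s in sentences]

    # Pad to the longest prompt so the batch forms a rectangular tensor.
    inputs = tokenizer(prompts, return_tensors="pt",
                       padding=True, truncation=True)

    # Set an explicit length budget so longer translations are not cut short.
    outputs = model.generate(**inputs,
                             max_new_tokens=max_new_tokens,
                             num_beams=num_beams)

    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With the model and tokenizer loaded as above, a call would look like `translate_batch(["Unde este gara?", "Mulțumesc foarte mult."], model, tokenizer)`.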

Safetensors · Model size: 2.95B params · Tensor type: F32
