Official repository: https://github.com/gonglinyuan/metro_t0

METRO-T0

Paper: Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers (ACL 2023)

METRO-T0 is a T5-style text-to-text Transformer pretrained with model-generated pretraining signals and then prompt-finetuned on the family of public NLP tasks proposed in T0. METRO-T0 is highly parameter-efficient: for example, METRO-T0-Large++ (775M parameters) outperforms GPT-3 (175B parameters) and T0-3B (3B parameters) on a wide range of NLP tasks.

Use METRO-T0+-Large++

To use METRO-T0+-Large++ in PyTorch (Python 3.7+, PyTorch 1.12+ and transformers 4.17+ are prerequisites), refer to the code snippet below:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# trust_remote_code=True lets transformers run the model and tokenizer code shipped with the checkpoint
model = AutoModelForSeq2SeqLM.from_pretrained("gonglinyuan/metro_t0p_largepp", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("gonglinyuan/metro_t0p_largepp", trust_remote_code=True)

# The task is phrased as a natural-language prompt
input_text = "Is this review positive or negative? Review: this is the best cast iron skillet you will ever buy"
# Tokenize, truncating the prompt to at most 512 tokens, and keep only the input IDs
inputs = tokenizer([input_text], max_length=512, truncation=True, add_special_tokens=True, return_tensors="pt").input_ids
# Greedy decoding (do_sample=False), generating at most 256 new tokens
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))  # expected: positive
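
To run several prompts without reloading the checkpoint, the snippet above can be wrapped in two small functions. This is a minimal sketch, not part of the official repository: the names `build_sentiment_prompt` and `classify_reviews` are hypothetical, and the prompt template is simply the example prompt from the snippet above. The model calls are the same ones used in that snippet, applied to one prompt at a time.

```python
from typing import List


def build_sentiment_prompt(review: str) -> str:
    """Format a review with the example prompt wording from this README (hypothetical helper)."""
    return f"Is this review positive or negative? Review: {review.strip()}"


def classify_reviews(reviews: List[str], model_name: str = "gonglinyuan/metro_t0p_largepp") -> List[str]:
    """Load the model once, then generate an answer for each review (hypothetical helper)."""
    # Imported here so the prompt helper can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    model = AutoModelForSeq2SeqLM.from_pretrained(model_name, trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

    answers = []
    for review in reviews:
        prompt = build_sentiment_prompt(review)
        # Same tokenization and generation settings as the snippet above.
        input_ids = tokenizer(
            [prompt], max_length=512, truncation=True,
            add_special_tokens=True, return_tensors="pt",
        ).input_ids
        output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
        answers.append(tokenizer.decode(output_ids[0], skip_special_tokens=True))
    return answers
```

Calling `classify_reviews(["this is the best cast iron skillet you will ever buy"])` downloads the checkpoint on first use and returns one decoded answer per review.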

Other METRO-T0 Models

Model	# Parameters	Pretraining Data	Prompt-Finetuning Data
METRO-T0-Base	226M	Wikibook (16G)	T0 Train
METRO-T0+-Base	226M	Wikibook (16G)	T0+ Train
METRO-T0++-Base	226M	Wikibook (16G)	T0++ Train
METRO-T0-Base++	256M	160G corpus	T0 Train
METRO-T0+-Base++	256M	160G corpus	T0+ Train
METRO-T0++-Base++	256M	160G corpus	T0++ Train
METRO-T0-Large++	775M	160G corpus	T0 Train
METRO-T0+-Large++	775M	160G corpus	T0+ Train
METRO-T0++-Large++	775M	160G corpus	T0++ Train
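
The table can also be kept as a small lookup for choosing a checkpoint programmatically. The sketch below is illustrative only. The only Hub ID confirmed by this README is `gonglinyuan/metro_t0p_largepp`; the helper `guess_hub_id` is hypothetical and assumes the other checkpoints follow the same naming pattern (lowercase, `+` becomes `p`, `-` becomes `_`), so its output should be checked against the official repository before use.

```python
HUB_USER = "gonglinyuan"

# (parameters, pretraining data, prompt-finetuning data), copied from the table above.
MODELS = {
    "METRO-T0-Base": ("226M", "Wikibook (16G)", "T0 Train"),
    "METRO-T0+-Base": ("226M", "Wikibook (16G)", "T0+ Train"),
    "METRO-T0++-Base": ("226M", "Wikibook (16G)", "T0++ Train"),
    "METRO-T0-Base++": ("256M", "160G corpus", "T0 Train"),
    "METRO-T0+-Base++": ("256M", "160G corpus", "T0+ Train"),
    "METRO-T0++-Base++": ("256M", "160G corpus", "T0++ Train"),
    "METRO-T0-Large++": ("775M", "160G corpus", "T0 Train"),
    "METRO-T0+-Large++": ("775M", "160G corpus", "T0+ Train"),
    "METRO-T0++-Large++": ("775M", "160G corpus", "T0++ Train"),
}


def guess_hub_id(model_name: str) -> str:
    """Guess a Hub ID from a table name (ASSUMED naming pattern, not confirmed by the README)."""
    if model_name not in MODELS:
        raise KeyError(f"Unknown METRO-T0 model: {model_name}")
    slug = model_name.lower().replace("+", "p").replace("-", "_")
    return f"{HUB_USER}/{slug}"


# The one ID this README confirms is reproduced by the assumed pattern.
print(guess_hub_id("METRO-T0+-Large++"))
```

If the assumed pattern holds, the returned string can be passed directly to `from_pretrained` in place of the hard-coded ID.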

Citation

If you find the code and models useful for your research, please cite the following paper:

@misc{gong2023modelgenerated,
      title={Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers}, 
      author={Linyuan Gong and Chenyan Xiong and Xiaodong Liu and Payal Bajaj and Yiqing Xie and Alvin Cheung and Jianfeng Gao and Xia Song},
      year={2023},
      eprint={2305.12567},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2305.12567}
}
