---
language:
- en
datasets:
- Reza8848/MUFFIN_68k
license: mit
---
<img src="https://cdn-uploads.huggingface.co/production/uploads/6434a6e8ea46c009904c617e/J_4FHXmtM6TuRnN3aL06y.png" width="38" height="38">
These are the model weights of **MUFFIN-T5-3B** (**Mu**lti-**F**aceted **In**structions).
We fine-tune the [T5-3B](https://huggingface.co/t5-3b) model on our [MUFFIN dataset](https://arxiv.org/abs/2312.02436).
We released both 3B and 11B models:
|Model|Number of parameters|
|-|-|
|[MUFFIN-T5-3B](https://huggingface.co/Reza8848/MUFFIN-T5-3B)|3 billion|
|[MUFFIN-T5-11B](https://huggingface.co/Reza8848/MUFFIN-T5-11B)|11 billion|
You can also find the Llama2-based model weights [here](https://huggingface.co/Reza8848/MUFFIN-Llama2-lora-13B).
## Prompt Template
Please use the following prompt template when using the models for inference (including the evaluations on **SuperNI-Test**, **T0-Eval**, and **BBH**):
```python
prompt = "### Input:\n{input}"
prompt += "\n\n"
prompt += "### Instruction:\n{instruction}"
prompt += "\n\n"
prompt += "### Output:\n"
print(prompt)
```
Please use the prompt below when testing the models on multiple-choice classification tasks (e.g., **MMLU**).
```python
prompt = "### Input:\n{input}"
prompt += "\n\n"
prompt += "### Instruction:\n{instruction}\n"
prompt += "(A): {option1}\n(B): {option2}\n(C): {option3}\n(D): {option4}\nAvoid answers outside of (A, B, C, D)."  # Add one more sentence to the prompt to constrain the output space
prompt += "\n\n"
prompt += "### Output:\n"
print(prompt)
```
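To make the classification template concrete, here is a sketch of filling it with a made-up multiple-choice instance (the question and options below are invented for illustration; they are not taken from MMLU):

```python
# Sketch: filling the classification prompt template with an invented
# multiple-choice instance (for illustration only).
prompt = "### Input:\n{input}"
prompt += "\n\n"
prompt += "### Instruction:\n{instruction}\n"
prompt += "(A): {option1}\n(B): {option2}\n(C): {option3}\n(D): {option4}\nAvoid answers outside of (A, B, C, D)."
prompt += "\n\n"
prompt += "### Output:\n"

value_dict = {
    "input": "The capital of France is ___.",
    "instruction": "Choose the option that correctly completes the sentence.",
    "option1": "Berlin",
    "option2": "Paris",
    "option3": "Madrid",
    "option4": "Rome",
}
print(prompt.format_map(value_dict))
```

Note that the filled prompt ends with `### Output:\n`, so the model's generation directly continues as the answer.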
## Model Usage
Download our model weights with the Hugging Face 🤗 Transformers library:
```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
## Download
tokenizer = AutoTokenizer.from_pretrained("Reza8848/MUFFIN-T5-3B")
model = AutoModelForSeq2SeqLM.from_pretrained("Reza8848/MUFFIN-T5-3B")
## Inference
#### Please prepare your testing instance (as shown below)
value_dict = {
    "input": "Drink more wine when you feel thirsty.\nDrink more water when you feel thirsty.",
    "instruction": "In this task, you are given two unconventional instructions for quenching thirst. Your goal is to identify which instruction is more likely to be followed by a person who wants to try something new or different. Answer \"Wine\" if the person is more likely to drink wine when thirsty, and \"Water\" if they are more likely to drink water."
}
#### Please use the prompt template mentioned before
input_sequence = prompt.format_map(value_dict)
input_ids = tokenizer(input_sequence, return_tensors="pt").input_ids
raw_outputs = model.generate(input_ids) # set the generation arguments according to your needs (e.g., `do_sample`, `num_beams`)
outputs = tokenizer.decode(raw_outputs[0], skip_special_tokens=True)
print(outputs)
```
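The `model.generate` call above uses default decoding. As a minimal sketch of passing common generation arguments (`num_beams` for beam search, `do_sample` for sampling), shown here with the small public `t5-small` checkpoint as a quick-to-download stand-in; substitute `Reza8848/MUFFIN-T5-3B` for actual MUFFIN inference:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# t5-small is a lightweight stand-in so this sketch runs quickly;
# replace with "Reza8848/MUFFIN-T5-3B" in practice.
model_name = "t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

input_ids = tokenizer("translate English to German: Hello, world!",
                      return_tensors="pt").input_ids

# Beam search: deterministic, often better for short, constrained outputs.
beam_outputs = model.generate(input_ids, num_beams=4, max_new_tokens=32)

# Sampling: stochastic, useful for open-ended generation.
sample_outputs = model.generate(input_ids, do_sample=True, top_p=0.9,
                                temperature=0.7, max_new_tokens=32)

print(tokenizer.decode(beam_outputs[0], skip_special_tokens=True))
print(tokenizer.decode(sample_outputs[0], skip_special_tokens=True))
```

Beam search tends to suit classification-style evaluations with a fixed answer space, while sampling suits open-ended generation.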
## Zero-Shot Evaluation Performances
Our training and inference code is based on [Tk-Instruct](https://github.com/yizhongw/Tk-Instruct/tree/main), including the [metric calculation scripts](https://github.com/yizhongw/Tk-Instruct/blob/main/src/compute_metrics.py) (i.e., `ROUGE-L` and `Exact-Match`).
<div style="text-align:center"><img src="https://cdn-uploads.huggingface.co/production/uploads/6434a6e8ea46c009904c617e/J1ZMmCs6GvrRKyD1hazTs.png" alt="performances.png" width="600"/></div>
## 🥳 Citation
Please kindly cite our paper if you use any resources in this repository:
```bibtex
@inproceedings{Lou2023MUFFIN,
title={{MUFFIN}: Curating Multi-Faceted Instructions for Improving Instruction Following},
author={Renze Lou and Kai Zhang and Jian Xie and Yuxuan Sun and Janice Ahn and Hanzi Xu and Yu Su and Wenpeng Yin},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=1vrS1zwekw}
}
```