import streamlit as st
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline
st.title('Question-Answering NLU')
st.sidebar.title('Navigation')
menu = st.sidebar.radio("", options=["Demo", "Parsing NLU data into SQuAD 2.0", "Training",
                                     "Evaluation"], index=0)
if menu == "Demo":
    st.markdown('''
Question Answering NLU (QANLU) is an approach that maps the NLU task into question answering,
leveraging pre-trained question-answering models to perform well on few-shot settings. Instead of
training an intent classifier or a slot tagger, for example, we can ask the model intent- and
slot-related questions in natural language:
```
Context : I'm looking for a cheap flight to Boston.
Question: Is the user looking to book a flight?
Answer : Yes
Question: Is the user asking about departure time?
Answer : No
Question: What price is the user looking for?
Answer : cheap
Question: Where is the user flying from?
Answer : (empty)
```
Thus, by asking questions for each intent and slot in natural language, we can effectively construct an NLU hypothesis. For more details,
please read the paper:
[Language model is all you need: Natural language understanding as question answering](https://assets.amazon.science/33/ea/800419b24a09876601d8ab99bfb9/language-model-is-all-you-need-natural-language-understanding-as-question-answering.pdf).
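
The hypothesis construction described above can be sketched in a few lines of Python. This is only an illustration: `build_hypothesis`, the intent names, and the slot names are hypothetical and are not part of QANLU or the paper. The answers are the ones from the flight example above.

```python
def build_hypothesis(intent_answers, slot_answers):
    """Combine per-question answers into one NLU hypothesis (hypothetical helper)."""
    # An intent is kept when its question was answered "yes" (case and trailing dot ignored).
    intents = [
        name
        for name, answer in intent_answers.items()
        if answer.strip().strip(".").lower() == "yes"
    ]
    # A slot is kept when its question produced a non-empty answer span.
    slots = {name: answer for name, answer in slot_answers.items() if answer.strip()}
    return {"intents": intents, "slots": slots}


hypothesis = build_hypothesis(
    intent_answers={"book_flight": "Yes", "departure_time": "No"},
    slot_answers={"price": "cheap", "origin": ""},
)
print(hypothesis)
```

An empty answer is treated as "slot not present", which mirrors the `(empty)` answer in the example.
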
In this Space, we will see how to transform an example
NLU dataset (e.g. utterances and intent / slot annotations) into [SQuAD 2.0 format](https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/)
question-answering data that can be used by QANLU.
Feel free to query the pre-trained QA-NLU model using the buttons below.
*Please note that this model has been trained on ATIS and may need to be further fine-tuned to support intents and slots that are not covered in ATIS*.
''')
    # Load the pre-trained QANLU model and wrap it in a question-answering pipeline.
    tokenizer = AutoTokenizer.from_pretrained("AmazonScience/qanlu")
    model = AutoModelForQuestionAnswering.from_pretrained("AmazonScience/qanlu")
    qa_pipeline = pipeline('question-answering', model=model, tokenizer=tokenizer)

    context = st.text_input(
        'Please enter the context (remember to include "Yes. No. " at the beginning):',
        value="Yes. No. I want a cheap flight to Boston."
    )
    question = st.text_input(
        'Please enter the intent question:',
        value="Are they looking for a flight?"
    )
    qa_input = {
        'context': context,
        'question': question
    }

    # Run the model only when the user clicks the button.
    if st.button('Ask QANLU'):
        answer = qa_pipeline(qa_input)
        st.write(answer)
elif menu == "Parsing NLU data into SQuAD 2.0":
    st.header('QA-NLU Data Parsing')
    st.markdown('''
Here, we show a small example of how NLU data can be transformed into QANLU data.
The same method can be used to transform [MATIS++](https://github.com/amazon-research/multiatis)
NLU data (e.g. utterances and intent / slot annotations) into [SQuAD 2.0 format](https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/)
question-answering data that can be used by QANLU.
Here is an example dataset with three intents and two examples per intent:
````
restaurant, I am looking for some Vietnamese food
restaurant, What is there to eat around here?
music, Play my workout playlist
music, Can you find Bob Dylan songs?
flight, Show me flights from Oakland to Dallas
flight, I want two economy tickets from Miami to Chicago
````
Next, we define some questions for each intent. These can be free-form questions or generated from templates.
````
{
'restaurant': [
'Did they ask for a restaurant?',
'Did they mention a restaurant?'
],
'music': [
'Did they ask for music?',
'Do they want to play music?'
],
'flight': [
'Did they ask for a flight?',
'Do they want to book a flight?'
]
}
````
The next step is to run the `atis.py` script from the [QA-NLU Amazon Research repository](https://github.com/amazon-research/question-answering-nlu).
That script will produce a json file that looks like this:
````
{
"version": 1.0,
"data": [
{
"title": "MultiATIS++",
"paragraphs": [
{
"context": "yes. no. i am looking for some vietnamese food",
"qas": [
{
"question": "did they ask for a restaurant?",
"id": "49f1180cb9ce4178a8a90f76c21f69b4",
"is_impossible": false,
"answers": [
{
"text": "yes",
"answer_start": 0
}
],
"slot": "",
"intent": "restaurant"
},
{
"question": "did they ask for music?",
"id": "a7ffe039fb3e4843ae16d5a68194f45e",
"is_impossible": false,
"answers": [
{
"text": "no",
"answer_start": 5
}
],
"slot": "",
"intent": "restaurant"
},
... <More questions>
... <More paragraphs>
````
There are many tunable parameters when generating the above file, such as how many negative examples to include per question. Follow the same process for training a slot-tagging model.
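
As a rough illustration of the transformation for intents, the sketch below builds the same structure from the example dataset and questions. It is not the `atis.py` script: `intent_to_squad` is a hypothetical helper, the random `uuid4` ids are an assumption, and it emits every question for every utterance instead of sampling negative examples.

```python
import json
import uuid

# Extractive QA can only return spans of the context, so "yes" and "no"
# are prepended to every utterance, as in the JSON example above.
PREFIX = "yes. no. "


def intent_to_squad(examples, questions, title="example"):
    """Build a SQuAD 2.0-style dict from (intent, utterance) pairs (hypothetical helper)."""
    paragraphs = []
    for intent, utterance in examples:
        context = PREFIX + utterance.lower()
        qas = []
        for question_intent, question_list in questions.items():
            # The answer is "yes" only for questions about the utterance's own intent.
            answer = "yes" if question_intent == intent else "no"
            for question in question_list:
                qas.append({
                    "question": question.lower(),
                    "id": uuid.uuid4().hex,  # assumption: any unique id works
                    "is_impossible": False,
                    "answers": [{"text": answer, "answer_start": PREFIX.index(answer)}],
                    "slot": "",
                    "intent": intent,
                })
        paragraphs.append({"context": context, "qas": qas})
    return {"version": 1.0, "data": [{"title": title, "paragraphs": paragraphs}]}


examples = [
    ("restaurant", "I am looking for some Vietnamese food"),
    ("music", "Play my workout playlist"),
]
questions = {
    "restaurant": ["Did they ask for a restaurant?"],
    "music": ["Did they ask for music?"],
}
squad = intent_to_squad(examples, questions)
print(json.dumps(squad["data"][0]["paragraphs"][0], indent=2))
```

The `answer_start` offsets (0 for `yes`, 5 for `no`) follow directly from the position of each word in the prefix.
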
''')
elif menu == "Training":
    st.header('QA-NLU Training')
    st.markdown('''
To train a QA-NLU model on the data we created, we use the `run_squad.py` script from [Hugging Face](https://github.com/huggingface/transformers/blob/master/examples/legacy/question-answering/run_squad.py) with a SQuAD-trained QA model as our base. For example, we can use the `deepset/roberta-base-squad2` model from [here](https://huggingface.co/deepset/roberta-base-squad2). The command below assumes 8 GPUs are present:
''')
    st.code('''
mkdir models
python -m torch.distributed.launch --nproc_per_node=8 run_squad.py \\
--model_type roberta \\
--model_name_or_path deepset/roberta-base-squad2 \\
--do_train \\
--do_eval \\
--do_lower_case \\
--train_file data/matis_en_train_squad.json \\
--predict_file data/matis_en_test_squad.json \\
--learning_rate 3e-5 \\
--num_train_epochs 2 \\
--max_seq_length 384 \\
--doc_stride 64 \\
--output_dir models/qanlu/ \\
--per_gpu_train_batch_size 8 \\
--overwrite_output_dir \\
--version_2_with_negative \\
--save_steps 100000 \\
--gradient_accumulation_steps 8 \\
--seed $RANDOM
''')
elif menu == "Evaluation":
    st.header('QA-NLU Evaluation')
    st.markdown('''
To assess the performance of the trained model, we can use the `calculate_pr.py` script from the [QA-NLU Amazon Research repository](https://github.com/amazon-research/question-answering-nlu).
Feel free to query the pre-trained QA-NLU model in the Demo section.
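
The script name suggests precision and recall. As an illustration of those two metrics only, and not a reproduction of `calculate_pr.py`, here is a minimal sketch that scores predicted intents against gold intents. The function `precision_recall` and the `(utterance_id, intent)` pair format are assumptions made for this example.

```python
def precision_recall(gold, predicted):
    """Micro-averaged precision and recall over (utterance_id, intent) pairs."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


gold = [(0, "flight"), (1, "music")]
predicted = [(0, "flight"), (1, "restaurant"), (2, "music")]
precision, recall = precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```
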
''')