import streamlit as st
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline

st.title('Question-Answering NLU')

st.sidebar.title('Navigation')
menu = st.sidebar.radio("", options=["Introduction", "Parsing NLU data into SQuAD 2.0", "Training",
                                     "Evaluation"], index=0)


if menu == "Introduction":

    st.markdown('''

        Question Answering NLU (QANLU) is an approach that maps NLU tasks to question answering,
        leveraging pre-trained question-answering models to perform well in few-shot settings. Instead of
        training an intent classifier or a slot tagger, for example, we can ask the model intent- and
        slot-related questions in natural language:
        
        ```
        Context : I'm looking for a cheap flight to Boston.
        
        Question: Is the user looking to book a flight?
        Answer  : Yes
        
        Question: Is the user asking about departure time?
        Answer  : No
        
        Question: What price is the user looking for?
        Answer  : cheap
        
        Question: Where is the user flying from?
        Answer  : (empty)
        ```
        
        Thus, by asking questions for each intent and slot in natural language, we can effectively construct an NLU hypothesis. For more details,
        please read the paper: 
        [Language model is all you need: Natural language understanding as question answering](https://assets.amazon.science/33/ea/800419b24a09876601d8ab99bfb9/language-model-is-all-you-need-natural-language-understanding-as-question-answering.pdf).
        
        In this Space, we will see how to transform an example
        NLU dataset (e.g. utterances and intent / slot annotations) into [SQuAD 2.0 format](https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/)
        question-answering data that can be used by QANLU.

    ''')
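    st.markdown(r'''
As a rough illustration of how per-question answers could be combined into an NLU hypothesis, here is a minimal sketch. It is not the paper's implementation: `build_hypothesis`, the intent names and the slot names are hypothetical, and the answers are hard-coded from the flight example rather than produced by a model.

```python
def build_hypothesis(intent_answers, slot_answers):
    """Combine per-question answers into a single NLU hypothesis.

    intent_answers maps an intent name to a "yes"/"no" answer;
    slot_answers maps a slot name to an extracted span ("" if unanswerable).
    """
    # Keep only the intents whose question was answered with "yes".
    intents = [name for name, answer in intent_answers.items()
               if answer.strip().lower() == "yes"]
    # Keep only the slots for which the model returned a non-empty span.
    slots = {name: span for name, span in slot_answers.items() if span}
    return {"intents": intents, "slots": slots}


# Answers copied from the flight example (hard-coded for illustration).
hypothesis = build_hypothesis(
    intent_answers={"book_flight": "Yes", "departure_time": "No"},
    slot_answers={"price": "cheap", "origin": ""},
)
print(hypothesis)
```
''')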

elif menu == "Parsing NLU data into SQuAD 2.0":
    st.header('QA-NLU Data Parsing')
    
    st.markdown('''
        Here, we show a small example of how NLU data can be transformed into QANLU data.
        The same method can be used to transform [MultiATIS++](https://github.com/amazon-research/multiatis)
        NLU data (e.g. utterances and intent / slot annotations) into [SQuAD 2.0 format](https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/)
        question-answering data that can be used by QANLU.
        
        Here is an example dataset with three intents and two examples per intent: 
        
        ````
        restaurant, I am looking for some Vietnamese food
        restaurant, What is there to eat around here?
        music, Play my workout playlist
        music, Can you find Bob Dylan songs?
        flight, Show me flights from Oakland to Dallas
        flight, I want two economy tickets from Miami to Chicago
        ````
        
        Next, we define a few questions per intent. These can be free-form questions or generated from templates.
        
        ````
        {
            'restaurant': [
                'Did they ask for a restaurant?',
                'Did they mention a restaurant?'
            ],
            'music': [
                'Did they ask for music?',
                'Do they want to play music?'
            ],
            'flight': [
                'Did they ask for a flight?',
                'Do they want to book a flight?'
            ]
        }
        ````
        
        The next step is to run the `atis.py` script from the [QA-NLU Amazon Research repository](https://github.com/amazon-research/question-answering-nlu).
        That script produces a JSON file that looks like this:
        
        ````
        {
        "version": 1.0,
        "data": [
            {
                "title": "MultiATIS++",
                "paragraphs": [
                    {
                        "context": "yes. no. i am looking for some vietnamese food",
                        "qas": [
                            {
                                "question": "did they ask for a restaurant?",
                                "id": "49f1180cb9ce4178a8a90f76c21f69b4",
                                "is_impossible": false,
                                "answers": [
                                    {
                                        "text": "yes",
                                        "answer_start": 0
                                    }
                                ],
                                "slot": "",
                                "intent": "restaurant"
                            },
                            {
                                "question": "did they ask for music?",
                                "id": "a7ffe039fb3e4843ae16d5a68194f45e",
                                "is_impossible": false,
                                "answers": [
                                    {
                                        "text": "no",
                                        "answer_start": 5
                                    }
                                ],
                                "slot": "",
                                "intent": "restaurant"
                            },
                            ... <More questions>
                            
                ... <More paragraphs>
        ````
        
        Many parameters can be tuned when generating the above file, such as the number of negative examples to include per question. Follow the same process to generate training data for a slot-tagging model.
        
    ''')
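    st.markdown(r'''
The transformation can be sketched in plain Python. This is a simplified illustration and not the `atis.py` script itself: `to_squad` is a hypothetical helper, it asks every question for every utterance (no negative-example sampling), and it only covers intents. The `"yes. no. "` prefix, the lower-casing and the field names mirror the sample output shown in this section.

```python
import json
import uuid

PREFIX = "yes. no. "  # prepended so that yes/no answers are extractable spans


def to_squad(examples, questions, title="example"):
    """Convert (intent, utterance) pairs into a SQuAD 2.0-style dictionary."""
    paragraphs = []
    for intent, utterance in examples:
        context = PREFIX + utterance.lower()
        qas = []
        for question_intent, intent_questions in questions.items():
            # The answer is "yes" only for questions about the true intent.
            answer = "yes" if question_intent == intent else "no"
            for question in intent_questions:
                qas.append({
                    "question": question.lower(),
                    "id": uuid.uuid4().hex,
                    "is_impossible": False,
                    "answers": [{
                        "text": answer,
                        # Character offset of the answer inside the prefix.
                        "answer_start": PREFIX.index(answer),
                    }],
                    "slot": "",
                    "intent": intent,  # gold intent of the utterance
                })
        paragraphs.append({"context": context, "qas": qas})
    return {"version": 1.0, "data": [{"title": title, "paragraphs": paragraphs}]}


examples = [
    ("restaurant", "I am looking for some Vietnamese food"),
    ("music", "Play my workout playlist"),
]
questions = {
    "restaurant": ["Did they ask for a restaurant?"],
    "music": ["Did they ask for music?"],
}
squad = to_squad(examples, questions)
print(json.dumps(squad["data"][0]["paragraphs"][0], indent=2))
```

Keeping the answers inside the context is what lets an extractive QA model, which can only point at spans, answer yes/no questions.
''')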
    
elif menu == "Training":
    st.header('QA-NLU Training')
    
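    # Raw string: otherwise Python treats each backslash-newline in the shell
    # command below as a line continuation and drops it from the rendered text.
    st.markdown(r'''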
        To train a QA-NLU model on the data we created, we use the `run_squad.py` script from [Hugging Face](https://github.com/huggingface/transformers/blob/master/examples/legacy/question-answering/run_squad.py) with a SQuAD-trained QA model as our base. As an example, we can use the [`deepset/roberta-base-squad2`](https://huggingface.co/deepset/roberta-base-squad2) model (the command below assumes 8 GPUs are available):
        
        ```
        mkdir models
        
        python -m torch.distributed.launch --nproc_per_node=8 run_squad.py \
            --model_type roberta \
            --model_name_or_path deepset/roberta-base-squad2 \
            --do_train \
            --do_eval \
            --do_lower_case \
            --train_file data/matis_en_train_squad.json \
            --predict_file data/matis_en_test_squad.json \
            --learning_rate 3e-5 \
            --num_train_epochs 2 \
            --max_seq_length 384 \
            --doc_stride 64 \
            --output_dir models/qanlu/ \
            --per_gpu_train_batch_size 8 \
            --overwrite_output_dir \
            --version_2_with_negative \
            --save_steps 100000 \
            --gradient_accumulation_steps 8 \
            --seed $RANDOM
        ```
    ''')
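    st.markdown(r'''
As a quick sanity check on the flags above, the effective batch size follows from three of them. This is a minimal sketch with illustrative variable names; the single-GPU adjustment is an assumption about how one might adapt the command, not a tested recipe.

```python
num_gpus = 8                     # --nproc_per_node
per_gpu_train_batch_size = 8     # --per_gpu_train_batch_size
gradient_accumulation_steps = 8  # --gradient_accumulation_steps

# Number of training examples that contribute to each optimizer update.
effective_batch_size = (
    num_gpus * per_gpu_train_batch_size * gradient_accumulation_steps
)
print(effective_batch_size)

# On a single GPU, the same effective batch size needs more accumulation steps.
single_gpu_accumulation_steps = effective_batch_size // (1 * per_gpu_train_batch_size)
print(single_gpu_accumulation_steps)
```
''')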

elif menu == "Evaluation":
    st.header('QA-NLU Evaluation')
    
    st.markdown('''
        To assess the performance of the trained model, we can use the `calculate_pr.py` script from the [QA-NLU Amazon Research repository](https://github.com/amazon-research/question-answering-nlu).
        
        Feel free to query the pre-trained QA-NLU model using the form below.
    ''')
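    st.markdown(r'''
For orientation, precision, recall and F1 over predicted labels can be computed as in the sketch below. This is a generic illustration and not necessarily what `calculate_pr.py` does; `precision_recall` and the example (slot, value) pairs are hypothetical.

```python
def precision_recall(predicted, gold):
    """Compute precision, recall and F1 over two collections of labels."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical (slot, value) pairs for one utterance.
gold = {("destination", "boston"), ("price", "cheap")}
predicted = {("destination", "boston"), ("origin", "miami")}
precision, recall, f1 = precision_recall(predicted, gold)
print(precision, recall, f1)
```
''')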
    
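    # Load the tokenizer and model from the Hugging Face Hub. `use_auth_token=True`
    # sends the locally stored Hub token with the request.
    tokenizer = AutoTokenizer.from_pretrained("AmazonScience/qanlu", use_auth_token=True)

    model = AutoModelForQuestionAnswering.from_pretrained("AmazonScience/qanlu", use_auth_token=True)

    # Wrap both in an extractive question-answering pipeline.
    qa_pipeline = pipeline('question-answering', model=model, tokenizer=tokenizer)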
    
    context = st.text_input(
        'Please enter the context:',
        value="I want a cheap flight to Boston."
    )
    question = st.text_input(
        'Please enter the question:',
        value="What is the destination?"
    )


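    # Prepend "Yes. No. " so that yes/no answers are available as extractable spans,
    # mirroring the "yes. no. " prefix in the generated training data.
    qa_input = {
        'context': 'Yes. No. ' + context,
        'question': question
    }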

    if st.button('Ask QANLU'):
        answer = qa_pipeline(qa_input)
        st.write(answer)
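
    st.markdown(r'''
The question-answering pipeline returns a dictionary with `score`, `start`, `end` and `answer` keys. A minimal sketch of how such a result could be turned into an intent decision or a slot value follows. The `interpret` helper and the 0.5 threshold are assumptions for illustration, and the example dictionaries are made up rather than real model output.

```python
PREFIX = "Yes. No. "  # same prefix the app prepends to the context


def interpret(answer, threshold=0.5):
    """Map a QA pipeline result to an intent decision or a slot value."""
    if answer["score"] < threshold:
        return None  # treat low-confidence answers as "no answer"
    text = answer["answer"].strip().rstrip(".")
    if answer["start"] < len(PREFIX):
        # The span lies in the prepended prefix, so this is a yes/no answer.
        return text.lower() == "yes"
    return text  # otherwise it is a span from the user's utterance


# Hypothetical results, shaped like the dictionaries the pipeline returns.
print(interpret({"score": 0.98, "start": 34, "end": 40, "answer": "Boston"}))
print(interpret({"score": 0.95, "start": 0, "end": 3, "answer": "Yes"}))
print(interpret({"score": 0.10, "start": 5, "end": 7, "answer": "No"}))
```
''')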