roberta-base-qnli-finetuned
This model is a fine-tuned version of roberta-base on the QNLI dataset. It achieves the following results on the evaluation set:
- Loss: 0.2133
- Accuracy: 0.9176
Model description
This is FacebookAI/roberta-base fine-tuned on the QNLI dataset. QNLI consists of question-sentence pairs, each labeled according to whether or not the sentence entails (that is, contains) the answer to the question.
Intended uses & limitations
This model is intended for datasets similar to QNLI (question-sentence pair classification), and it can also be fine-tuned further for another downstream task. No restrictions are placed on its use; anyone can use it.
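A minimal inference sketch for the intended use described above. The repository id is taken from the model tree line at the end of this card; the label order (0 = entailment, 1 = not_entailment) is an assumption based on GLUE's QNLI convention, so check `model.config.id2label` before relying on it. The helper names `label_from_logits` and `classify` are illustrative only.

```python
import math

# Assumption: GLUE's QNLI label order. Verify against model.config.id2label.
QNLI_LABELS = ("entailment", "not_entailment")


def label_from_logits(logits):
    """Apply a softmax to the two logits and return (label, probability)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return QNLI_LABELS[best], probs[best]


def classify(question, sentence, model_id="sid29/roberta-base-qnli-finetuned"):
    """Score one question-sentence pair with the fine-tuned checkpoint."""
    # Imported lazily so label_from_logits stays usable without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # The question and the sentence are encoded together as a single pair.
    inputs = tokenizer(question, sentence, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return label_from_logits(logits)
```

Calling `classify("Where is the Eiffel Tower?", "The Eiffel Tower is in Paris.")` downloads the checkpoint on first use and returns a label with its probability.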
Training and evaluation data
The model was trained and evaluated on the QNLI dataset.
About the dataset: The Stanford Question Answering Dataset is a question-answering dataset consisting of question-paragraph pairs,
where one of the sentences in the paragraph (drawn from Wikipedia) contains the answer to the corresponding question (written by an annotator).
The authors of the benchmark convert the task into sentence pair classification by forming a pair between each question and each sentence in the corresponding context,
and filtering out pairs with low lexical overlap between the question and the context sentence. The task is to determine whether the context sentence contains the answer
to the question. This modified version of the original task removes the requirement that the model select the exact answer, but also removes the simplifying
assumptions that the answer is always present in the input and that lexical overlap is a reliable cue. source: here
- Training dataset: The training split of QNLI was used to fine-tune roberta-base; it contains about 105,000 examples.
- Evaluation dataset: The validation split of QNLI was used to evaluate roberta-base-qnli-finetuned; it contains about 5,460 examples.
Training procedure
The model was fine-tuned in a Colab environment on a T4 GPU. The dataset was first tokenized with the RoBERTa tokenizer. The training arguments are listed in the Training hyperparameters section.
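The tokenization step might look like the sketch below. It is not the exact preprocessing code used for this model: the dataset id `nyu-mll/glue`, the `question` and `sentence` column names, and the 512-token limit are assumptions based on the GLUE release of QNLI and on roberta-base's usual maximum length, and both function names are hypothetical.

```python
def preprocess(batch, tokenizer, max_length=512):
    """Encode each question together with its candidate sentence as one pair."""
    return tokenizer(
        batch["question"],
        batch["sentence"],
        truncation=True,
        max_length=max_length,
    )


def load_tokenized_qnli(model_name="FacebookAI/roberta-base"):
    """Load QNLI and tokenize every split (downloads data and tokenizer files)."""
    # Imported lazily so preprocess can be used and tested on its own.
    from datasets import load_dataset
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    dataset = load_dataset("nyu-mll/glue", "qnli")  # assumed dataset id
    tokenized = dataset.map(lambda batch: preprocess(batch, tokenizer), batched=True)
    return tokenized, tokenizer
```

Padding is left out on purpose; a data collator such as `DataCollatorWithPadding` can pad each batch dynamically at training time.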
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 4e-06
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
- mixed_precision_training: Native AMP
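As a rough sketch, the list above could be expressed as `TrainingArguments` keyword arguments as follows. Mapping "Native AMP" to `fp16=True`, evaluating once per epoch, and the output directory name are assumptions rather than details stated in this card. The `eval_strategy` name applies to recent Transformers releases; older ones call it `evaluation_strategy`.

```python
# Keyword arguments mirroring the hyperparameter list.
TRAINING_KWARGS = {
    "learning_rate": 4e-6,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "gradient_accumulation_steps": 4,  # 16 * 4 = total train batch size of 64
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 4,
    "fp16": True,              # assumption: "Native AMP" means fp16 mixed precision
    "eval_strategy": "epoch",  # assumption: one evaluation per epoch, as in the results table
}


def build_training_arguments(output_dir="roberta-base-qnli-finetuned"):
    """Build TrainingArguments from the settings above (fp16 requires a GPU)."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```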
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
0.3191 | 0.9995 | 1636 | 0.2405 | 0.9023 |
0.2739 | 1.9997 | 3273 | 0.2214 | 0.9109 |
0.2467 | 2.9998 | 4910 | 0.2115 | 0.9180 |
0.231 | 3.9982 | 6544 | 0.2133 | 0.9176 |
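The step counts in the table can be related to the batch settings with a short calculation. This sketch assumes a single GPU, a training split of 104,743 examples (the published GLUE QNLI size; this card only says "about 105,000"), and that a trailing partial accumulation group is dropped at the end of each epoch.

```python
import math

train_examples = 104_743  # assumption: published GLUE QNLI training split size
per_device_batch = 16     # train_batch_size
grad_accum_steps = 4      # gradient_accumulation_steps
num_epochs = 4

# Examples contributing to one optimizer update.
effective_batch = per_device_batch * grad_accum_steps

# Mini-batches per epoch; the last one may be smaller than 16.
batches_per_epoch = math.ceil(train_examples / per_device_batch)

# Optimizer updates per epoch, dropping the incomplete accumulation group.
updates_per_epoch = batches_per_epoch // grad_accum_steps
total_updates = updates_per_epoch * num_epochs

print(effective_batch, batches_per_epoch, updates_per_epoch, total_updates)
# prints: 64 6547 1636 6544
```

Under these assumptions the result agrees with the 1636 steps reported for the first epoch and the 6544 steps reported for the final one.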
Framework versions
- Transformers 4.42.4
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
Model tree for sid29/roberta-base-qnli-finetuned
Base model
FacebookAI/roberta-base