system HF staff committed on
Commit
0a064fd
1 Parent(s): a9e8e94

Update README.md

  1. README.md +97 -0
README.md ADDED
@@ -0,0 +1,97 @@
+
+ # Turkish QNLI Model
+
+ I fine-tuned the Turkish BERT model below for the question-answering problem, using TQuAD, the Turkish version of SQuAD:
+ https://huggingface.co/dbmdz/bert-base-turkish-uncased
+
+ # Data: TQuAD
+ I used the following TQuAD dataset:
+
+ https://github.com/TQuad/turkish-nlp-qa-dataset
+
+ I converted the dataset into the GLUE QNLI data format used by transformers with the following script
+ (SQuAD -> QNLI):
+
+ ```
+ import json
+
+ # Input split: use "dev-v0.1.json" for the dev set or "train-v0.1.json" for the training set.
+ ff = "train-v0.1.json"
+ with open(ff, encoding="utf-8") as f:
+     dataset = json.load(f)
+
+ i = 0
+ for article in dataset['data']:
+     for p in article['paragraphs']:
+         # Distinct answers (first answer of each question) in this paragraph.
+         paragraph_answers = set(e['answers'][0]['text'] for e in p['qas'])
+         for qa in p['qas']:
+             question = qa['question'].replace(";", ":")
+             answer = qa['answers'][0]['text']
+             # The question paired with its own answer is a positive example.
+             i = i + 1
+             print(i, question, answer.replace(";", ":"), "entailment", sep="\t")
+             # The same question paired with every other answer is a negative example.
+             for other in paragraph_answers - {answer}:
+                 i = i + 1
+                 print(i, question, other.replace(";", ":"), "not_entailment", sep="\t")
+ ```
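
The conversion above can also be sketched as a reusable function that writes a TSV file directly. This is only an illustration: the function names `squad_to_qnli_rows` and `write_qnli_tsv`, the header row (`index`, `question`, `sentence`, `label`), the whitespace clean-up, and the toy dataset are assumptions that are not part of the original script.

```python
import io


def clean(text):
    """Collapse tabs and newlines so a field cannot break the TSV layout."""
    return " ".join(text.split())


def squad_to_qnli_rows(dataset):
    """Yield (question, candidate, label) triples from a SQuAD-style dict."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            # Skip questions without answers (assumption: they cannot be labeled).
            qas = [qa for qa in paragraph["qas"] if qa["answers"]]
            # Sorted so that the output order is deterministic.
            answers = sorted({qa["answers"][0]["text"] for qa in qas})
            for qa in qas:
                gold = qa["answers"][0]["text"]
                yield clean(qa["question"]), clean(gold), "entailment"
                for other in answers:
                    if other != gold:
                        yield clean(qa["question"]), clean(other), "not_entailment"


def write_qnli_tsv(dataset, handle):
    """Write a header plus one numbered row per example to an open text file."""
    handle.write("index\tquestion\tsentence\tlabel\n")
    for i, row in enumerate(squad_to_qnli_rows(dataset), start=1):
        handle.write("\t".join([str(i), *row]) + "\n")


# Toy input (hypothetical, not taken from TQuAD) to show the call pattern.
toy = {"data": [{"title": "t", "paragraphs": [{"context": "c", "qas": [
    {"question": "Q1", "answers": [{"text": "A1"}]},
    {"question": "Q2", "answers": [{"text": "A2"}]},
]}]}]}
buffer = io.StringIO()
write_qnli_tsv(toy, buffer)
tsv_lines = buffer.getvalue().splitlines()
```

With a real file, the same call would take an open file handle, for example `open("train.tsv", "w", encoding="utf-8")`, instead of the in-memory buffer.
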
+
+ Under the QNLI folder there are dev and test sets.
+ The training data looks like this:
+ > 613 II.Friedrich’in bilginler arasındaki en önemli şahsiyet olarak belirttiği kişi kimdir? filozof, kimyacı, astrolog ve çevirmen not_entailment
+ > 614 II.Friedrich’in bilginler arasındaki en önemli şahsiyet olarak belirttiği kişi kimdir? kişisel eğilimi ve özel temaslar nedeniyle not_entailment
+ > 615 Michael Scotus’un mesleği nedir? filozof, kimyacı, astrolog ve çevirmen entailment
+ > 616 Michael Scotus’un mesleği nedir? Palermo’ya not_entailment
+
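
To make the row layout concrete, here is a small sketch that parses rows like the ones above and counts the labels. It assumes the four fields are tab-separated, as produced by the conversion script with `sep="\t"`; the helper names `parse_qnli_line` and `label_counts` are hypothetical.

```python
from collections import Counter


def parse_qnli_line(line):
    """Split one row into (index, question, candidate answer, label)."""
    index, question, candidate, label = line.rstrip("\n").split("\t")
    return int(index), question, candidate, label


def label_counts(lines):
    """Count how often each label occurs, ignoring blank lines."""
    return Counter(parse_qnli_line(line)[3] for line in lines if line.strip())


# Two of the sample rows shown above, with the tabs written out explicitly.
sample_lines = [
    "615\tMichael Scotus’un mesleği nedir?\tfilozof, kimyacı, astrolog ve çevirmen\tentailment",
    "616\tMichael Scotus’un mesleği nedir?\tPalermo’ya\tnot_entailment",
]
counts = label_counts(sample_lines)
```

Because every question is paired once with its own answer and once with each other answer in the paragraph, `not_entailment` rows normally outnumber `entailment` rows in the full file.
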
+ # Training
+
+ I trained the model with the following environment variables:
+ ```
+ export GLUE_DIR=./glue/glue_dataTR/QNLI
+ export TASK_NAME=QNLI
+ ```
+
+ ```
+ python3 run_glue.py \
+   --model_type bert \
+   --model_name_or_path dbmdz/bert-base-turkish-uncased \
+   --task_name $TASK_NAME \
+   --do_train \
+   --do_eval \
+   --data_dir $GLUE_DIR \
+   --max_seq_length 128 \
+   --per_gpu_train_batch_size 32 \
+   --learning_rate 2e-5 \
+   --num_train_epochs 3.0 \
+   --output_dir /tmp/$TASK_NAME/
+ ```
+
+ # Evaluation Results
+
+ | Metric | Value |
+ |--------|-------|
+ | acc | 0.9124060613527165 |
+ | loss | 0.21582801340189717 |
+
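
As a reading aid for the `acc` value, the sketch below computes plain accuracy, the share of examples whose predicted label equals the gold label. It assumes that `acc` here is this simple accuracy; the function name and the toy labels are hypothetical and are not the model's actual predictions.

```python
def accuracy(predictions, gold_labels):
    """Return the fraction of positions where prediction and gold label agree."""
    if len(predictions) != len(gold_labels):
        raise ValueError("predictions and gold_labels must have the same length")
    if not gold_labels:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(p == g for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)


# Toy labels (hypothetical) to show the call pattern: 3 of 4 predictions match.
toy_gold = ["entailment", "not_entailment", "not_entailment", "entailment"]
toy_pred = ["entailment", "not_entailment", "entailment", "entailment"]
toy_acc = accuracy(toy_pred, toy_gold)
```
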
+ > See all my models:
+ > https://huggingface.co/savasy