Commit e159f47 by ilos-vigil
1 Parent(s): b93f6a6

Add README.md, model weight and Tensorboard log

README.md CHANGED
@@ -21,4 +21,159 @@ widget:
 
  # Indonesian small BigBird model NLI
 
- This commit contains the model weights from epoch 6, which has the lowest loss and highest accuracy.
+ ## Source Code
+
+ The source code used to create this model and run the benchmarks is available at [https://github.com/ilos-vigil/bigbird-small-indonesian](https://github.com/ilos-vigil/bigbird-small-indonesian).
+
+ ## Model Description
+
+ This model is based on [bigbird-small-indonesian](https://huggingface.co/ilos-vigil/bigbird-small-indonesian) and was fine-tuned on 2 datasets. It is intended for zero-shot text classification.
+
+ ## How to use
+
+ > Inference for ZSC (Zero-Shot Classification) task
+
+ ```py
+ >>> from transformers import pipeline
+ >>> pipe = pipeline(
+ ...     task='zero-shot-classification',
+ ...     model='./tmp/checkpoint-28832'  # local checkpoint directory
+ ... )
+ >>> pipe(
+ ...     sequences='Fakta nomor 7 akan membuat ada terkejut',  # roughly "Fact number 7 will surprise you"
+ ...     candidate_labels=['clickbait', 'bukan clickbait'],  # "clickbait", "not clickbait"
+ ...     hypothesis_template='Judul video ini {}.',  # "This video title is {}."
+ ...     multi_label=False
+ ... )
+ {
+     'sequence': 'Fakta nomor 7 akan membuat ada terkejut',
+     'labels': ['clickbait', 'bukan clickbait'],
+     'scores': [0.6102734804153442, 0.38972654938697815]
+ }
+ >>> pipe(
+ ...     sequences='Samsung tuntut balik Apple dengan alasan hak paten teknologi.',  # "Samsung countersues Apple over technology patent rights."
+ ...     # Labels: technology, sports, business, politics, health, culinary
+ ...     candidate_labels=['teknologi', 'olahraga', 'bisnis', 'politik', 'kesehatan', 'kuliner'],
+ ...     hypothesis_template='Kategori berita ini adalah {}.',  # "The category of this news item is {}."
+ ...     multi_label=True
+ ... )
+ {
+     'sequence': 'Samsung tuntut balik Apple dengan alasan hak paten teknologi.',
+     'labels': ['politik', 'teknologi', 'kesehatan', 'bisnis', 'olahraga', 'kuliner'],
+     'scores': [0.7390161752700806, 0.6657379269599915, 0.4459509551525116, 0.38407933712005615, 0.3679264783859253, 0.14181996881961823]
+ }
+ ```
+
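The `multi_label=True` scores above do not sum to 1, while the `multi_label=False` scores do. A minimal sketch of why, assuming the usual NLI-based zero-shot approach: each candidate label is filled into the hypothesis template, the NLI model scores every premise/hypothesis pair, and the resulting logits are normalized either across labels (single-label) or per label against contradiction (multi-label). The helper names and logit values are hypothetical and only illustrate the arithmetic; they are not taken from the model.

```python
import math


def build_hypotheses(candidate_labels, hypothesis_template):
    """Fill each candidate label into the hypothesis template."""
    return [hypothesis_template.format(label) for label in candidate_labels]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def single_label_scores(entailment_logits):
    """multi_label=False: entailment logits compete across labels, so scores sum to 1."""
    return softmax(entailment_logits)


def multi_label_scores(logit_pairs):
    """multi_label=True: each label is scored alone as entailment vs. contradiction."""
    return [softmax([entail, contra])[0] for entail, contra in logit_pairs]


hypotheses = build_hypotheses(['clickbait', 'bukan clickbait'], 'Judul video ini {}.')
print(hypotheses)

# Made-up (entailment, contradiction) logits, one pair per hypothesis.
made_up_logits = [(2.0, -1.0), (0.5, 0.3)]
print(single_label_scores([entail for entail, _ in made_up_logits]))
print(multi_label_scores(made_up_logits))
```

Because the multi-label scores are independent, several labels can score high at once, which suits inputs that belong to more than one category.
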
+ > Inference for NLI (Natural Language Inference) task
+
+ ```py
+ >>> from transformers import pipeline
+ >>> pipe = pipeline(
+ ...     task='text-classification',
+ ...     model='./tmp/checkpoint-28832',  # local checkpoint directory
+ ...     return_all_scores=True
+ ... )
+ >>> pipe({
+ ...     'text': 'Nasi adalah makanan pokok.',  # Premise: "Rice is a staple food."
+ ...     'text_pair': 'Saya mau makan nasi goreng.'  # Hypothesis: "I want to eat fried rice."
+ ... })
+ [
+     {'label': 'entailment', 'score': 0.25495028495788574},
+     {'label': 'neutral', 'score': 0.40920916199684143},
+     {'label': 'contradiction', 'score': 0.33584052324295044}
+ ]
+ >>> pipe({
+ ...     # Premise: "Python is often used for web development and AI research."
+ ...     'text': 'Python sering digunakan untuk web development dan AI research.',
+ ...     # Hypothesis: "AI research usually does not use the Python programming language."
+ ...     'text_pair': 'AI research biasanya tidak menggunakan bahasa pemrograman Python.'
+ ... })
+ [
+     {'label': 'entailment', 'score': 0.12508109211921692},
+     {'label': 'neutral', 'score': 0.22146646678447723},
+     {'label': 'contradiction', 'score': 0.653452455997467}
+ ]
+ ```
+
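To reduce the per-label NLI scores to a single prediction, one can take the label with the highest score. A minimal sketch using the scores printed in the two NLI examples; `top_label` is a hypothetical helper, not part of `transformers`.

```python
def top_label(scores):
    """Return the label of the highest-scoring {'label', 'score'} entry."""
    return max(scores, key=lambda item: item['score'])['label']


# Scores copied from the two NLI examples in this README.
rice_example = [
    {'label': 'entailment', 'score': 0.25495028495788574},
    {'label': 'neutral', 'score': 0.40920916199684143},
    {'label': 'contradiction', 'score': 0.33584052324295044},
]
python_example = [
    {'label': 'entailment', 'score': 0.12508109211921692},
    {'label': 'neutral', 'score': 0.22146646678447723},
    {'label': 'contradiction', 'score': 0.653452455997467},
]

print(top_label(rice_example))    # neutral
print(top_label(python_example))  # contradiction
```
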
+ ## Limitation and bias
+
+ This model inherits the limitations and biases of its parent model and of the 2 datasets used for fine-tuning. Like most language models, it is also sensitive to changes in the input. In the example below, only the hypothesis template changes, yet the scores shift noticeably.
+
+ ```py
+ >>> from transformers import pipeline
+ >>> pipe = pipeline(
+ ...     task='zero-shot-classification',
+ ...     model='./tmp/checkpoint-28832'  # local checkpoint directory
+ ... )
+ >>> text = 'Resep sate ayam enak dan mudah.'  # "Easy and delicious chicken satay recipe."
+ >>> candidate_labels = ['kuliner', 'olahraga']  # "culinary", "sports"
+ >>> pipe(
+ ...     sequences=text,
+ ...     candidate_labels=candidate_labels,
+ ...     hypothesis_template='Kategori judul artikel ini adalah {}.',  # "The category of this article title is {}."
+ ...     multi_label=False
+ ... )
+ {
+     'sequence': 'Resep sate ayam enak dan mudah.',
+     'labels': ['kuliner', 'olahraga'],
+     'scores': [0.7711364030838013, 0.22886358201503754]
+ }
+ >>> pipe(
+ ...     sequences=text,
+ ...     candidate_labels=candidate_labels,
+ ...     hypothesis_template='Kelas kalimat ini {}.',  # "The class of this sentence is {}."
+ ...     multi_label=False
+ ... )
+ {
+     'sequence': 'Resep sate ayam enak dan mudah.',
+     'labels': ['kuliner', 'olahraga'],
+     'scores': [0.7043636441230774, 0.295636385679245]
+ }
+ >>> pipe(
+ ...     sequences=text,
+ ...     candidate_labels=candidate_labels,
+ ...     hypothesis_template='{}.',  # label only, no surrounding sentence
+ ...     multi_label=False
+ ... )
+ {
+     'sequence': 'Resep sate ayam enak dan mudah.',
+     'labels': ['kuliner', 'olahraga'],
+     'scores': [0.5986711382865906, 0.4013288915157318]
+ }
+ ```
+
+ ## Training, evaluation and testing data
+
+ This model was fine-tuned on [IndoNLI](https://huggingface.co/datasets/indonli) and [multilingual-NLI-26lang-2mil7](https://huggingface.co/datasets/MoritzLaurer/multilingual-NLI-26lang-2mil7). Although the `multilingual-NLI-26lang-2mil7` dataset is machine-translated, it slightly improves the NLI benchmark results and substantially improves the ZSC benchmark results. The evaluation and testing data come from the IndoNLI dataset only.
+
+ ## Training Procedure
+
+ The model was fine-tuned on a single RTX 3060 for 16 epochs (28832 steps) with an accumulated batch size of 64. The AdamW optimizer was used with a learning rate of 1e-4, a weight decay of 0.05, learning rate warmup over the first 6% of steps (1730 steps), and linear decay of the learning rate afterwards. Note that although the model weights at epoch 9 have the lowest loss and highest accuracy, they perform slightly worse on the ZSC benchmark. Additional information is available in the TensorBoard training logs.
+
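The schedule described in the training procedure can be sketched in a few lines. This is an illustration under stated assumptions, not the author's training code: it assumes the warmup rises linearly from 0, that the linear decay ends at 0 on the final step, and that the warmup step count is the rounded-up 6% of the total. All constant and function names are hypothetical.

```python
import math

TOTAL_STEPS = 28832   # from the text: 16 epochs
EPOCHS = 16
PEAK_LR = 1e-4        # from the text
WARMUP_RATIO = 0.06   # from the text: first 6% of steps

# Assumption: the warmup step count is rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)
STEPS_PER_EPOCH = TOTAL_STEPS // EPOCHS


def learning_rate(step):
    """Linear warmup to PEAK_LR, then linear decay (assumed to reach 0 at the last step)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = TOTAL_STEPS - step
    return PEAK_LR * max(0.0, remaining / (TOTAL_STEPS - WARMUP_STEPS))


print(WARMUP_STEPS)      # 1730, matching the step count quoted in the text
print(STEPS_PER_EPOCH)   # 1802
print(learning_rate(9 * STEPS_PER_EPOCH))  # learning rate at the end of epoch 9
```
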
+ ## Benchmark as NLI model
+
+ Both benchmarks include the results of 2 other models for comparison. Additional benchmarks on the IndoNLI dataset are available in its paper, [IndoNLI: A Natural Language Inference Dataset for Indonesian](https://aclanthology.org/2021.emnlp-main.821/).
+
+ | Model                                      | bigbird-small-indonesian-nli | xlm-roberta-large-xnli | mDeBERTa-v3-base-xnli-multilingual-nli-2mil7 |
+ | ------------------------------------------ | ---------------------------- | ---------------------- | -------------------------------------------- |
+ | Parameter count                            | 30.6M                        | 559.9M                 | 278.8M                                       |
+ | Multilingual                               |                              | V                      | V                                            |
+ | Finetuned on IndoNLI                       | V                            |                        | V                                            |
+ | Finetuned on multilingual-NLI-26lang-2mil7 | V                            |                        |                                              |
+ | Test (Lay)                                 | 0.6888                       | 0.2226                 | 0.8151                                       |
+ | Test (Expert)                              | 0.5734                       | 0.3505                 | 0.7775                                       |
+
+ ## Benchmark as ZSC model
+
+ [Indonesian-Twitter-Emotion-Dataset](https://github.com/meisaputri21/Indonesian-Twitter-Emotion-Dataset/) is used for the ZSC benchmark. The benchmark covers 4 configurations (multi-label on or off, hypothesis template on or off), which affect each model differently. The hypothesis templates are `Kalimat ini mengekspresikan perasaan {}.` ("This sentence expresses the feeling of {}.") and `{}.`. Note that the F1 score only takes the label with the highest probability into account. In the table, italics mark the best score for each model and bold marks the best score overall.
+
+ | Model                                        | Multi-label | Use template | F1 Score     |
+ | -------------------------------------------- | ----------- | ------------ | ------------ |
+ | mDeBERTa-v3-base-xnli-multilingual-nli-2mil7 | V           | V            | 0.3574       |
+ |                                              | V           |              | 0.3654       |
+ |                                              |             | V            | 0.3985       |
+ |                                              |             |              | _0.4160_     |
+ | xlm-roberta-large-xnli                       | V           | V            | _**0.6292**_ |
+ |                                              | V           |              | 0.5596       |
+ |                                              |             | V            | 0.5737       |
+ |                                              |             |              | 0.5433       |
+ | bigbird-small-indonesian-nli                 | V           | V            | 0.5324       |
+ |                                              | V           |              | _0.5499_     |
+ |                                              |             | V            | 0.5269       |
+ |                                              |             |              | 0.5228       |
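
Scoring only the highest-probability label means each zero-shot output is first reduced to one predicted label. The sketch below shows one way to do that and compute an F1 score from the result. It assumes macro averaging, which the README does not specify, and it relies on the pipeline returning `labels` sorted by descending score, as in the outputs shown in this README. The helper names, emotion labels and example outputs are hypothetical.

```python
def top_labels(pipeline_outputs):
    """Keep only the first (highest-scoring) label of each zero-shot output."""
    return [output['labels'][0] for output in pipeline_outputs]


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 scores (assumed averaging mode)."""
    classes = sorted(set(gold) | set(predicted))
    f1_scores = []
    for cls in classes:
        tp = sum(g == cls and p == cls for g, p in zip(gold, predicted))
        fp = sum(g != cls and p == cls for g, p in zip(gold, predicted))
        fn = sum(g == cls and p != cls for g, p in zip(gold, predicted))
        denominator = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical outputs in the same shape the zero-shot pipeline returns.
outputs = [
    {'labels': ['senang', 'sedih'], 'scores': [0.8, 0.2]},
    {'labels': ['sedih', 'senang'], 'scores': [0.6, 0.4]},
    {'labels': ['sedih', 'senang'], 'scores': [0.7, 0.3]},
]
gold = ['senang', 'sedih', 'senang']

predicted = top_labels(outputs)
print(predicted)
print(macro_f1(gold, predicted))
```
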
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:99d7283876c3bfeeb0248e9b29019683c61e8852e325ded3937f8cba2c4d115c
+ oid sha256:7dd660ec1ad44f03e6b89f7601c445e24b9a8905863185b183591c21c3773412
  size 122439617
runs/sanitzed_log/events.out.tfevents.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:51a519f546ec054b68522c20514f856fd3d560d7330699c5de4e1ade098eb864
+ size 93238