RJuro committed on
Commit
c63da06
1 Parent(s): e9f6a38

Update README.md

Files changed (1)
  1. README.md +71 -0
README.md CHANGED
@@ -1,3 +1,74 @@
  ---
+ language: da
+ tags:
+ - danish
+ - bert
+ - sentiment
+ - text-classification
+ - Maltehb/danish-bert-botxo
+ - Helsinki-NLP/opus-mt-en-da
+ - go-emotion
+ - Certainly
  license: cc-by-4.0
+ datasets:
+ - go_emotions
+ metrics:
+ - Accuracy
+ widget:
+ - text: "Det er så sødt af dig at tænke på andre på den måde ved du det?"
+ - text: "Jeg vil gerne have en playstation."
+ - text: "Jeg elsker dig"
  ---
+
+ # Danish-Bert-GoÆmotion
+
+ Danish Go-Emotion classifier: [Maltehb/danish-bert-botxo](https://huggingface.co/Maltehb/danish-bert-botxo) (uncased), fine-tuned on a Danish translation of the [go_emotions](https://huggingface.co/datasets/go_emotions) dataset produced with [Helsinki-NLP/opus-mt-en-da](https://huggingface.co/Helsinki-NLP/opus-mt-en-da). Performance is therefore limited by the quality of the machine translation.
+
+
+ ## Training Parameters:
+
+ ```
+ Num examples = 189900
+ Num Epochs = 3
+ Train batch = 8
+ Eval batch = 8
+ Learning Rate = 3e-5
+ Warmup steps = 4273
+ Total optimization steps = 71125
+ ```
+
+ ## Loss
+ ### Training loss
+ ![](wb_loss.png)
+
+ ### Eval. loss
+ ```
+ 0.1178 (21100 examples)
+ ```
+
+
+ ## Using the model with `transformers`
+ The easiest way to use the model is through the `transformers` `pipeline`:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
+
+ # Load the fine-tuned model and its tokenizer from the Hugging Face Hub
+ model = AutoModelForSequenceClassification.from_pretrained('RJuro/danish-bert-go-aemotion')
+ tokenizer = AutoTokenizer.from_pretrained('RJuro/danish-bert-go-aemotion')
+
+ # "sentiment-analysis" is the pipeline alias for text classification
+ classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
+
+ # Returns the top-scoring label for the input text
+ classifier('jeg elsker dig')
+ ```
+
+ `[{'label': 'kærlighed', 'score': 0.9634820818901062}]`
+
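+ ## Using the model with `simpletransformers`
+
+ ```python
+ import pandas as pd
+ from simpletransformers.classification import MultiLabelClassificationModel
+
+ # Example input; replace with your own DataFrame that has a `text` column
+ df = pd.DataFrame({'text': ['Jeg elsker dig', 'Jeg vil gerne have en playstation.']})
+
+ model = MultiLabelClassificationModel('bert', 'RJuro/danish-bert-go-aemotion')
+
+ # predict() expects a list of strings rather than a pandas Series
+ predictions, raw_outputs = model.predict(df['text'].tolist())
+ ```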
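
The `raw_outputs` returned by `predict` above hold one score per emotion label for each input text. The following is a minimal sketch of turning such scores into label names, assuming the scores are already probabilities in [0, 1]. The three-label list, the `labels_above_threshold` helper, and the 0.5 threshold are hypothetical and chosen for illustration; the real label set and its order should be read from the model's configuration.

```python
from typing import Dict, List, Sequence

# Hypothetical label order for illustration only; the real order should be
# taken from the model's configuration rather than hard-coded like this.
LABELS = ["kærlighed", "glæde", "vrede"]


def labels_above_threshold(
    scores: Sequence[Sequence[float]],
    labels: Sequence[str] = LABELS,
    threshold: float = 0.5,
) -> List[Dict[str, float]]:
    """Map each row of per-label scores to {label: score} for scores >= threshold."""
    results = []
    for row in scores:
        if len(row) != len(labels):
            raise ValueError("each row needs exactly one score per label")
        results.append(
            {label: score for label, score in zip(labels, row) if score >= threshold}
        )
    return results


# Made-up scores for two texts (not real model output).
example_scores = [[0.96, 0.40, 0.01], [0.10, 0.20, 0.05]]
selected = labels_above_threshold(example_scores)
```

A text can end up with several labels or with none, which is the point of a multi-label setup. If the scores turn out to be raw logits rather than probabilities, a sigmoid would have to be applied before thresholding.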