Add model card
README.md
CHANGED
@@ -1,3 +1,50 @@
---
license: cc-by-nc-4.0
language:
- de
- fr
- it
- rm
- multilingual
inference: false
---

SwissBERT is a masked language model for processing Switzerland-related text. It has been trained on more than 21 million Swiss news articles retrieved from [Swissdox@LiRI](https://t.uzh.ch/1hI).

SwissBERT is based on [X-MOD](https://huggingface.co/facebook/xmod-base), which has been pre-trained with language adapters in 81 languages.
For SwissBERT, we trained adapters for the national languages of Switzerland – German, French, Italian, and Romansh Grischun.
In addition, we used a Switzerland-specific subword vocabulary.

The pre-training code and usage examples are available [here](https://github.com/ZurichNLP/swissbert). We also release a version that was fine-tuned on named entity recognition (NER): https://huggingface.co/ZurichNLP/swissbert-ner

## Languages

SwissBERT contains the following language adapters:

| lang_id (Adapter index) | Language code | Language              |
|-------------------------|---------------|-----------------------|
| 0                       | `de_CH`       | Swiss Standard German |
| 1                       | `fr_CH`       | French                |
| 2                       | `it_CH`       | Italian               |
| 3                       | `rm_CH`       | Romansh Grischun      |
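
A minimal usage sketch, assuming a `transformers` version with X-MOD support (≥ 4.28): the checkpoint is loaded with the Auto classes, and one of the adapters listed above is activated via `set_default_language`. The input sentence is an invented example.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load SwissBERT together with its Switzerland-specific tokenizer
tokenizer = AutoTokenizer.from_pretrained("ZurichNLP/swissbert")
model = AutoModel.from_pretrained("ZurichNLP/swissbert")

# Activate the Swiss Standard German adapter (lang_id 0 in the table above)
model.set_default_language("de_CH")

# Encode an invented example sentence
inputs = tokenizer("Die Berner Altstadt gehört zum UNESCO-Weltkulturerbe.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, hidden_size)
```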

## License
Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).

## Bias, Risks, and Limitations
- SwissBERT is mainly intended for tagging tokens in written text (e.g., named entity recognition, part-of-speech tagging), text classification, and the encoding of words, sentences, or documents into fixed-size embeddings (see the sketch after this list). SwissBERT is not designed for generating text.
- The model was adapted on written news articles and might perform worse on other domains or language varieties.
- While we have removed many author bylines, we did not anonymize the pre-training corpus. The model might have memorized information that has been described in the news but is no longer in the public interest.
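
As an illustration of the embedding use mentioned above, here is a minimal sketch that mean-pools the encoder output into one fixed-size vector per sentence. The pooling strategy and the example sentences are our own assumptions, not a recommendation from the model card.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("ZurichNLP/swissbert")
model = AutoModel.from_pretrained("ZurichNLP/swissbert")
model.set_default_language("fr_CH")  # French adapter (lang_id 1)

# Invented example sentences
sentences = [
    "Le Conseil fédéral siège à Berne.",
    "La Suisse compte quatre langues nationales.",
]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, seq_len, hidden_size)

# Mean-pool over non-padding tokens to obtain fixed-size sentence embeddings
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # (2, hidden_size)
```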

## Training Details
- Training data: German, French, Italian, and Romansh documents in the [Swissdox@LiRI](https://t.uzh.ch/1hI) database, until 2022.
- Training procedure: Masked language modeling

## Environmental Impact
- Hardware type: 8 × RTX 2080 Ti
- Hours used: 10 epochs × 18 hours × 8 devices = 1440 GPU-hours
- Site: Zurich, Switzerland
- Energy source: 100% hydropower ([source](https://t.uzh.ch/1rU))
- Carbon efficiency: 0.0016 kg CO2e/kWh ([source](https://t.uzh.ch/1rU))
- Carbon emitted: 0.6 kg CO2e ([source](https://mlco2.github.io/impact#compute))
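
The carbon figure is consistent with a back-of-the-envelope check, assuming roughly 250 W of draw per RTX 2080 Ti (an assumption not stated above):

```python
# Rough sanity check of the carbon estimate (assumes ~250 W per RTX 2080 Ti)
gpu_hours = 10 * 18 * 8          # epochs x hours x devices = 1440 GPU-hours
energy_kwh = gpu_hours * 0.250   # ~360 kWh at the assumed 250 W per device
co2e_kg = energy_kwh * 0.0016    # carbon efficiency from the list above
print(round(co2e_kg, 2))         # ~0.58 kg CO2e, in line with the 0.6 kg figure
```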