julien-c (HF staff) committed
Commit 270e45a
1 Parent(s): 52d84e5

Migrate model card from transformers-repo


Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/amine/bert-base-5lang-cased/README.md

Files changed (1)
  1. README.md +64 -0
README.md ADDED
---
language:
- en
- fr
- es
- de
- zh

tags:
- pytorch
- bert
- multilingual
- en
- fr
- es
- de
- zh

datasets: wikipedia

license: apache-2.0

inference: false
---

# bert-base-5lang-cased
This is a smaller version of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) that handles only 5 languages (en, fr, es, de and zh) instead of 104.
The model is therefore 30% smaller than the original (124M parameters instead of 178M) but gives exactly the same representations for the languages listed above.
Starting from `bert-base-5lang-cased` makes it easier to deploy your model on public cloud platforms while maintaining similar results.
For instance, Google Cloud Platform requires that the model size on disk be under 500 MB for serverless deployments (Cloud Functions / Cloud ML), which is not the case for the original `bert-base-multilingual-cased`.
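
One way to sanity-check the claim that both models give the same representations is to run the same sentence through each of them and compare the hidden states. The snippet below is a minimal sketch of such a check (it downloads both models, so it is only practical where bandwidth and disk space allow); the example sentence and tolerance are arbitrary choices, not part of the original evaluation:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Arbitrary sentence in one of the five covered languages.
sentence = "Paris est la capitale de la France."

small_tok = AutoTokenizer.from_pretrained("amine/bert-base-5lang-cased")
small_model = AutoModel.from_pretrained("amine/bert-base-5lang-cased")
full_tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
full_model = AutoModel.from_pretrained("bert-base-multilingual-cased")

with torch.no_grad():
    small_out = small_model(**small_tok(sentence, return_tensors="pt")).last_hidden_state
    full_out = full_model(**full_tok(sentence, return_tensors="pt")).last_hidden_state

# Same shape and (up to float tolerance) the same values are expected.
print(small_out.shape, full_out.shape)
print(torch.allclose(small_out, full_out, atol=1e-5))
```
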
For more information about the model's size, memory footprint and loading time, please refer to the table below:

| Model                        | Num parameters | Size   | Memory  | Loading time |
| ---------------------------- | -------------- | ------ | ------- | ------------ |
| bert-base-multilingual-cased | 178 million    | 714 MB | 1400 MB | 4.2 sec      |
| bert-base-5lang-cased        | 124 million    | 495 MB | 950 MB  | 3.6 sec      |

These measurements were computed on a [Google Cloud n1-standard-1 machine (1 vCPU, 3.75 GB)](https://cloud.google.com/compute/docs/machine-types#n1_machine_type).
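
As an illustration, the parameter count, disk size and loading time above can be reproduced with a short script. This is only a minimal sketch (the exact measurement protocol behind the table is not documented here, and the output directory name is arbitrary):

```python
import os
import time
from transformers import AutoModel

# Time how long the model takes to load (from the local cache after the
# first download).
start = time.perf_counter()
model = AutoModel.from_pretrained("amine/bert-base-5lang-cased")
print(f"Loading time: {time.perf_counter() - start:.1f} sec")

# Count parameters.
print(f"Num parameters: {sum(p.numel() for p in model.parameters()):,}")

# Serialize the model and measure its size on disk.
out_dir = "bert-base-5lang-cased-local"  # arbitrary directory name
model.save_pretrained(out_dir)
size = sum(os.path.getsize(os.path.join(out_dir, f)) for f in os.listdir(out_dir))
print(f"Size on disk: {size / 1e6:.0f} MB")
```
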
## How to use

```python
from transformers import AutoTokenizer, AutoModel

# Load the reduced tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("amine/bert-base-5lang-cased")
model = AutoModel.from_pretrained("amine/bert-base-5lang-cased")
```
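
For example, continuing from the block above, you can get contextual embeddings for a sentence (a minimal usage sketch; the sentence is arbitrary):

```python
# Encode a sentence in one of the five covered languages and run the model.
inputs = tokenizer("Bonjour tout le monde !", return_tensors="pt")
outputs = model(**inputs)

# last_hidden_state holds one 768-dimensional vector per input token.
print(outputs.last_hidden_state.shape)
```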

### How to cite

```bibtex
@inproceedings{smallermbert,
  title={Load What You Need: Smaller Versions of Multilingual BERT},
  author={Abdaoui, Amine and Pradel, Camille and Sigel, Grégoire},
  booktitle={SustaiNLP / EMNLP},
  year={2020}
}
```

## Contact

Please contact [email protected] for any questions, feedback or requests.