---
tags:
- generated_from_trainer
- chilean spanish
- español chileno
datasets:
- jorgeortizfuentes/chilean-spanish-corpus
model-index:
- name: patana-chilean-spanish-bert
  results: []
license: cc-by-4.0
language:
- es
pipeline_tag: fill-mask
---

# Patana Chilean Spanish BERT

This model is a fine-tuned version of [dccuchile/bert-base-spanish-wwm-cased](https://huggingface.co/dccuchile/bert-base-spanish-wwm-cased) trained on Chilean Spanish and multidialectal Spanish.

## Description

Patana was trained on the [Chilean Spanish Corpus](https://huggingface.co/datasets/jorgeortizfuentes/chilean-spanish-corpus), a corpus of Chilean Spanish texts (news, web pages, consumer complaints, and tweets).

Compared with other Spanish BERT models (and related families), this model stands out on tasks involving Chilean Spanish.
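
Since the card's `pipeline_tag` is `fill-mask`, the model can be tried with the standard `transformers` pipeline. A minimal sketch, assuming the Hub id `jorgeortizfuentes/patana-chilean-spanish-bert` (inferred from the model name above; verify before use):

```python
from transformers import pipeline

# Hub id assumed from the model name in this card; verify before running.
fill_mask = pipeline(
    "fill-mask",
    model="jorgeortizfuentes/patana-chilean-spanish-bert",
)

# BERT-style checkpoints use [MASK] as the mask token.
for pred in fill_mask("Fuimos a la playa a tomar [MASK]."):
    print(f"{pred['token_str']}: {pred['score']:.4f}")
```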

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- lr_scheduler_type: constant
- num_epochs: 1
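
For reference, these settings map directly onto `transformers.TrainingArguments`. A minimal sketch of an equivalent configuration (the `output_dir` value is illustrative, and any hyperparameters not listed above are left at their Trainer defaults):

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; output_dir is illustrative.
training_args = TrainingArguments(
    output_dir="patana-chilean-spanish-bert",
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    lr_scheduler_type="constant",
    num_train_epochs=1,
)
```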

### Training loss

| Epoch | Training loss |
|-------|---------------|
| 0.1   | 1.4046        |
| 0.2   | 1.3729        |
| 0.3   | 1.3504        |
| 0.4   | 1.3312        |
| 0.5   | 1.3171        |
| 0.6   | 1.3048        |
| 0.7   | 1.2958        |
| 1.0   | 1.3722        |

### Evaluation and comparison with other Spanish models

| Model               | Text classification (Chilean Spanish) | Token classification (Chilean Spanish) |
|---------------------|----------------------------------------|-----------------------------------------|
| Beto (BERT Spanish) | 0.8392                                 | 0.7544                                  |
| Bertin Roberta Base | 0.8325                                 | -                                       |
| Roberta Large BNE   | 0.8499                                 | 0.7697                                  |
| Tulio BERT          | **0.8503**                             | **0.7815**                              |
| Patana BERT         | 0.8435                                 | 0.7777                                  |
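
The card does not detail the benchmark datasets behind these scores. As a hypothetical sketch, this is how Patana could be loaded as the backbone for a comparable text classification fine-tune (Hub id assumed as above; the label count is illustrative):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical downstream setup; hub id and num_labels are assumptions.
model_id = "jorgeortizfuentes/patana-chilean-spanish-bert"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)
```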

### Training frameworks

- Transformers 4.30.2
- Pytorch 2.0.1+cu117
- Datasets 2.13.1
- Tokenizers 0.13.3

## Acknowledgments

We thank the [Departamento de Ciencias de la Computación de la Universidad de Chile](https://www.dcc.uchile.cl/) and [ReLeLa](https://relela.com/) for providing the servers used to train this model. We also thank the [Instituto Milenio Fundamentos de los Datos](https://imfd.cl/) for its support.

## License

The CC BY 4.0 license best reflects the intentions of our work. However, we are not certain that all of the data used to train this model carries licenses compatible with CC BY 4.0 (especially for commercial use).

## Limitations

The training dataset was not filtered or moderated in any way, so the model may encode unwanted ideological representations. Use with caution.

## Author

Model trained and datasets compiled by [Jorge Ortiz Fuentes](https://ortizfuentes.com).

## Citation

Pending
|