jlehecka committed on
Commit
549a79c
1 Parent(s): e2f223b

Update README.md

Files changed (1)
  1. README.md +19 -12
README.md CHANGED
@@ -8,11 +8,12 @@ license: "cc-by-nc-sa-4.0"
  ---
 
  # wav2vec2-base-sk-17k
- This is a monolingual Slovak Wav2Vec 2.0 base model pre-trained from 17 thousand of hours of Slovak speech.
+ This is a monolingual Slovak Wav2Vec 2.0 base model pre-trained from 17 thousand hours of Slovak speech.
+ It was introduced in the paper **Transfer Learning of Transformer-Based Speech Recognition Models from Czech to Slovak** accepted for the TSD2023 conference.
 
  This model does not have a tokenizer as it was pretrained on audio alone. In order to use this model for speech recognition, a tokenizer should be created, and the model should be fine-tuned on labeled data.
 
- The model was initialized from Czech pre-trained model [fav-kky/wav2vec2-base-cs-80k-ClTRUS](https://huggingface.co/fav-kky/wav2vec2-base-cs-80k-ClTRUS). We found this cross-language transfer learning approach better than pre-training from scratch. See our paper for details.
+ The model was initialized from the Czech pre-trained model [fav-kky/wav2vec2-base-cs-80k-ClTRUS](https://huggingface.co/fav-kky/wav2vec2-base-cs-80k-ClTRUS). We found this cross-language transfer learning approach better than pre-training from scratch. See our paper for details.
 
  ## Pretraining data
  Almost 18 thousand hours of unlabeled Slovak speech:
@@ -51,29 +52,35 @@ After fine-tuning, the model scored the following results on public datasets:
  See our paper for details.
 
  ## Paper
- The preprint of our paper (accepted to TSD 2023) is available at https://arxiv.org/abs/2306.04399.
+ The paper is available at https://link.springer.com/chapter/10.1007/978-3-031-40498-6_29.
+
+ The pre-print of our paper is available at https://arxiv.org/abs/2306.04399.
 
  ## Citation
  If you find this model useful, please cite our paper:
  ```
  @inproceedings{wav2vec2-base-sk-17k,
- title = {{Transfer Learning of Transformer-based Speech Recognition Models from Czech to Slovak}},
  author = {
- Jan Lehe\v{c}ka and
- Josef V. Psutka and
- Josef Psutka
+ Lehe\v{c}ka, Jan and
+ Psutka, Josef V. and
+ Psutka, Josef
  },
- booktitle = {{Text, Speech, and Dialogue}},
- publisher = {{Springer International Publishing}},
+ title = {{Transfer Learning of Transformer-Based Speech Recognition Models from Czech to Slovak}},
  year = {2023},
- note = {(in press)},
- url = {https://arxiv.org/abs/2306.04399},
+ isbn = {978-3-031-40497-9},
+ publisher = {Springer Nature Switzerland},
+ address = {Cham},
+ url = {https://doi.org/10.1007/978-3-031-40498-6_29},
+ doi = {10.1007/978-3-031-40498-6_29},
+ booktitle = {Text, Speech, and Dialogue: 26th International Conference, TSD 2023, Pilsen, Czech Republic, September 4–6, 2023, Proceedings},
+ pages = {328–338},
+ numpages = {11},
  }
  ```
 
  ## Related papers
  - [INTERSPEECH 2022 - Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech](https://www.isca-speech.org/archive/pdfs/interspeech_2022/lehecka22_interspeech.pdf)
- - INTERSPEECH 2023 - Transformer-based Speech Recognition Models for Oral History Archives in English, German, and Czech
+ - [INTERSPEECH 2023 - Transformer-based Speech Recognition Models for Oral History Archives in English, German, and Czech](https://www.isca-archive.org/interspeech_2023/lehecka23_interspeech.pdf)
 
  ## Related models
  - [fav-kky/wav2vec2-base-cs-80k-ClTRUS](https://huggingface.co/fav-kky/wav2vec2-base-cs-80k-ClTRUS)
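
The README states that the model ships without a tokenizer and that one should be created before fine-tuning on labeled data. A minimal sketch of the first step, building a character-level vocabulary from transcripts, might look like the following. The helper name `build_ctc_vocab`, the sample transcripts, and the special tokens `[UNK]`, `[PAD]`, and `|` are illustrative assumptions, not the authors' actual setup.

```python
import json


def build_ctc_vocab(transcripts, word_delimiter="|", unk_token="[UNK]", pad_token="[PAD]"):
    """Build a character-level vocabulary for CTC fine-tuning (hypothetical helper)."""
    chars = set()
    for text in transcripts:
        # Lowercasing is an assumption; adapt the normalization to your data.
        chars.update(text.lower())
    # Spaces are represented by the word delimiter instead of a literal space.
    chars.discard(" ")
    chars.discard(word_delimiter)
    symbols = [word_delimiter] + sorted(chars)
    vocab = {symbol: index for index, symbol in enumerate(symbols)}
    # Special tokens go last so that character ids stay contiguous.
    vocab[unk_token] = len(vocab)
    vocab[pad_token] = len(vocab)
    return vocab


# Hypothetical labeled transcripts; replace with your own fine-tuning data.
transcripts = ["dobrý deň", "ako sa máš"]
vocab = build_ctc_vocab(transcripts)

# Serialized form; written to a file, this is the kind of vocabulary a CTC tokenizer loads.
vocab_json = json.dumps(vocab, ensure_ascii=False, indent=2)
print(vocab_json)
```

Saved as `vocab.json`, such a file could then be passed to a CTC tokenizer class, for example `Wav2Vec2CTCTokenizer` from the `transformers` library with matching `unk_token`, `pad_token`, and `word_delimiter_token` arguments, before fine-tuning the model with a CTC head.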