Update README.md
README.md CHANGED
@@ -15,7 +15,7 @@ thumbnail: https://gsarti.com/publication/it5/featured.png
The [IT5](https://huggingface.co/models?search=it5) model family represents the first effort in pretraining large-scale sequence-to-sequence transformer models for the Italian language, following the approach adopted by the original [T5 model](https://github.com/google-research/text-to-text-transfer-transformer).

-This model is released as part of the project ["IT5:
+This model is released as part of the project ["IT5: Text-to-Text Pretraining for Italian Language Understanding and Generation"](https://aclanthology.org/2024.lrec-main.823/), by [Gabriele Sarti](https://gsarti.com/) and [Malvina Nissim](https://malvinanissim.github.io/), with the support of [Huggingface](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104) and with TPU usage sponsored by Google's [TPU Research Cloud](https://sites.research.google/trc/). All the training was conducted on a single TPU3v8-VM machine on Google Cloud. Refer to the Tensorboard tab of the repository for an overview of the training process.

*The inference widget is deactivated because the model needs task-specific seq2seq fine-tuning on a downstream task to be useful in practice. The models in the [`it5`](https://huggingface.co/it5) organization provide examples of this model fine-tuned on various downstream tasks.*
@@ -77,12 +77,22 @@ For problems or updates on this model, please contact [[email protected]
## Citation Information

```bibtex
-@
-title={
-author=
+@inproceedings{sarti-nissim-2024-it5-text,
+    title = "{IT}5: Text-to-text Pretraining for {I}talian Language Understanding and Generation",
+    author = "Sarti, Gabriele  and
+      Nissim, Malvina",
+    editor = "Calzolari, Nicoletta  and
+      Kan, Min-Yen  and
+      Hoste, Veronique  and
+      Lenci, Alessandro  and
+      Sakti, Sakriani  and
+      Xue, Nianwen",
+    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
+    month = may,
+    year = "2024",
+    address = "Torino, Italia",
+    publisher = "ELRA and ICCL",
+    url = "https://aclanthology.org/2024.lrec-main.823",
+    pages = "9422--9433",
}
```
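
The widget note above points out that the raw checkpoint only becomes useful after task-specific seq2seq fine-tuning. Below is a minimal sketch of that setup with the 🤗 Transformers API; the checkpoint id `gsarti/it5-base`, the `riassumi:` task prefix, and the toy input/target pair are illustrative assumptions rather than part of this card.

```python
# Minimal sketch of a seq2seq fine-tuning step for an IT5 checkpoint.
# NOTE: "gsarti/it5-base" is an assumed checkpoint id; substitute the
# identifier of the checkpoint this card describes.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "gsarti/it5-base"  # assumption, not taken from this card
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy supervised pair for an illustrative downstream task (summarization);
# a real fine-tuning run would iterate over a task-specific dataset.
inputs = tokenizer(
    "riassumi: Il modello IT5 è stato preaddestrato su grandi quantità di testo italiano.",
    return_tensors="pt",
)
labels = tokenizer(
    "IT5 è preaddestrato su testo italiano.",
    return_tensors="pt",
).input_ids

# A forward pass with labels returns the seq2seq cross-entropy loss
# that a fine-tuning loop would backpropagate.
outputs = model(**inputs, labels=labels)
print(float(outputs.loss))
```

A real run would replace the single pair with a downstream dataset and a training loop (for example via `Seq2SeqTrainer`), much like the fine-tuned models in the [`it5`](https://huggingface.co/it5) organization.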