---
language:
- it
license: apache-2.0
tags:
- italian
- sequence-to-sequence
- style-transfer
- efficient
- formality-style-transfer
datasets:
- yahoo/xformal_it
widget:
- text: "Questa performance è a dir poco spiacevole."
- text: "In attesa di un Suo cortese riscontro, Le auguriamo un piacevole proseguimento di giornata."
- text: "Questa visione mi procura una goduria indescrivibile."
- text: "qualora ciò possa interessarti, ti pregherei di contattarmi."
metrics:
- rouge
- bertscore
model-index:
- name: it5-efficient-small-el32-formal-to-informal
results:
- task:
type: formality-style-transfer
name: "Formal-to-informal Style Transfer"
dataset:
type: xformal_it
name: "XFORMAL (Italian Subset)"
metrics:
- type: rouge1
value: 0.459
name: "Avg. Test Rouge1"
- type: rouge2
value: 0.244
name: "Avg. Test Rouge2"
- type: rougeL
value: 0.435
name: "Avg. Test RougeL"
- type: bertscore
value: 0.739
name: "Avg. Test BERTScore"
---
# IT5 Cased Small Efficient EL32 for Formal-to-informal Style Transfer 🤗
*Shout-out to [Stefan Schweter](https://github.com/stefan-it) for contributing the pre-trained efficient model!*
This repository contains the checkpoint for the [IT5 Cased Small Efficient EL32](https://huggingface.co/it5/it5-efficient-small-el32)
model fine-tuned for formal-to-informal style transfer on the Italian subset of the XFORMAL dataset, as part of the experiments of the paper [IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation](https://arxiv.org/abs/2203.03759) by [Gabriele Sarti](https://gsarti.com) and [Malvina Nissim](https://malvinanissim.github.io).
Efficient IT5 models differ from the standard ones by adopting a different vocabulary that enables cased text generation and an [optimized model architecture](https://arxiv.org/abs/2109.10686) that improves performance while reducing parameter count. The Small-EL32 variant replaces the original encoder of the T5 Small architecture with a 32-layer deep encoder, showing improved performance over the base model.
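As a quick sanity check, the deep-encoder layout can be inspected from the checkpoint configuration. The snippet below is a minimal sketch and assumes the standard T5 config fields `num_layers` (encoder depth) and `num_decoder_layers` (decoder depth) are exposed by this checkpoint:

```python
from transformers import AutoConfig

# Load the configuration of the efficient checkpoint (requires network access)
config = AutoConfig.from_pretrained("it5/it5-efficient-small-el32-formal-to-informal")

# num_layers is the encoder depth, num_decoder_layers the decoder depth;
# this should reflect the 32-layer encoder described above
print(config.num_layers, config.num_decoder_layers)
```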
A comprehensive overview of other released materials is provided in the [gsarti/it5](https://github.com/gsarti/it5) repository. Refer to the paper for additional details concerning the reported scores and the evaluation approach.
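For reference, the sketch below illustrates how ROUGE and BERTScore figures of this kind can be computed with the 🤗 `evaluate` library; the exact evaluation setup (test split, generation parameters, BERTScore backbone) follows the paper and may differ from this illustration, and the example prediction/reference pair is hypothetical.

```python
import evaluate

# Hypothetical model outputs and reference informal paraphrases from the XFORMAL test set
predictions = ["e grazie per la vostra disponibilità!"]
references = ["ti ringrazio tanto per la tua disponibilità!"]

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

print(rouge.compute(predictions=predictions, references=references))
print(bertscore.compute(predictions=predictions, references=references, lang="it"))
```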
## Using the model
Model checkpoints are available for usage in TensorFlow, PyTorch, and JAX. They can be used directly with pipelines as:
```python
from transformers import pipeline
f2i = pipeline("text2text-generation", model='it5/it5-efficient-small-el32-formal-to-informal')
f2i("Vi ringrazio infinitamente per vostra disponibilità")
>>> [{"generated_text": "e grazie per la vostra disponibilità!"}]
```
or loaded using autoclasses:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("it5/it5-efficient-small-el32-formal-to-informal")
model = AutoModelForSeq2SeqLM.from_pretrained("it5/it5-efficient-small-el32-formal-to-informal")
```
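With the autoclasses loaded as above, generation can be run explicitly. The following minimal sketch continues the previous snippet; the decoding settings (beam size, maximum length) are illustrative and not taken from the paper:

```python
inputs = tokenizer("Vi ringrazio infinitamente per vostra disponibilità", return_tensors="pt")

# Illustrative generation settings; the paper's decoding parameters may differ
outputs = model.generate(**inputs, max_length=128, num_beams=4)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```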
If you use this model in your research, please cite our work as:
```bibtex
@article{sarti-nissim-2022-it5,
title={{IT5}: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation},
author={Sarti, Gabriele and Nissim, Malvina},
journal={ArXiv preprint 2203.03759},
url={https://arxiv.org/abs/2203.03759},
year={2022},
month={mar}
}
```
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10.0
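For reference, these hyperparameters roughly correspond to a 🤗 `Seq2SeqTrainingArguments` configuration like the sketch below; the output directory and any options not listed above are illustrative assumptions rather than the actual training script.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./outputs",              # placeholder output directory
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=10.0,
    lr_scheduler_type="linear",          # linear learning-rate decay
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```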
### Framework versions
- Transformers 4.15.0
- Pytorch 1.10.0+cu102
- Datasets 1.17.0
- Tokenizers 0.10.3