HeyLucasLeao/byt5-small-pt-product-reviews

HeyLucasLeao committed on Jul 2, 2021

Commit ec5f659 • 1 Parent(s): 8c66b04

Update README.md

Files changed (1)

README.md +68 -1

README.md CHANGED

@@ -1 +1,68 @@
-test

+Create README.md
+## ByT5 Small Portuguese Product Reviews
+#### Model Description
+This is a fine-tuned version of Google's ByT5 for sentiment analysis of product reviews in Portuguese.
+#### Training data
+It was trained on product reviews from Americanas.com. You can find the data here: https://github.com/b2wdigital/b2w-reviews01.
+#### Training Procedure
+It was fine-tuned using the Trainer class from the Hugging Face Transformers library. Accuracy, precision, recall, and F1 score were used for evaluation.
+##### Learning Rate: **2e-4**
+##### Epochs: **1**
+##### Colab for Finetuning: https://colab.research.google.com/drive/1EChTeQkGeXi_52lClBNazHVuSNKEHN2f
+##### Colab for Metrics: https://colab.research.google.com/drive/1o4tcsP3lpr1TobtE3Txhp9fllxPWXxlw#scrollTo=PXAoog5vQaTn
+#### Score:
+```python
+Training Set:
+'accuracy': 0.8699743370402053,
+'f1': 0.9072110777980404,
+'precision': 0.9432919284600922,
+'recall': 0.8737887200250071
+Test Set:
+'accuracy': 0.8680854858365782,
+'f1': 0.9058389204786557,
+'precision': 0.9420980625799903,
+'recall': 0.8722673967229191
+Validation Set:
+'accuracy': 0.8662624220987031,
+'f1': 0.9042450554751569,
+'precision': 0.9436194311603322,
+'recall': 0.8680250057883769
+```
+#### Goals
+My intention was purely educational: this version of the model is made available as an example for future purposes.
+#### How to use
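+```python
+import numpy as np
+import torch
+# ByT5 is an encoder-decoder (T5-style) model, so it is loaded with the seq2seq auto class
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+# Use the GPU when one is available
+device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+print(device)
+
+tokenizer = AutoTokenizer.from_pretrained("HeyLucasLeao/byt5-small-pt-product-reviews")
+model = AutoModelForSeq2SeqLM.from_pretrained("HeyLucasLeao/byt5-small-pt-product-reviews")
+model.to(device)
+
+def classificar_review(review):
+    # Tokenize a single review, padded/truncated to 512 byte-level tokens
+    inputs = tokenizer([review], padding='max_length', truncation=True, max_length=512, return_tensors='pt')
+    input_ids = inputs.input_ids.to(device)
+    attention_mask = inputs.attention_mask.to(device)
+    output = model.generate(input_ids, attention_mask=attention_mask)
+    # argmax over the generated token ids; the result is used as the class index
+    pred = np.argmax(output.cpu().numpy(), axis=1)
+    dici = {0: 'Review Negativo', 1: 'Review Positivo'}  # negative / positive review
+    return dici[pred.item()]
+
+# Example input (illustrative, not taken from the dataset)
+review = "Produto excelente, chegou antes do prazo."
+print(classificar_review(review))
+```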
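
Reviewer note on the Score section: F1 is the harmonic mean of precision and recall, so the reported figures can be cross-checked without rerunning the evaluation. The sketch below recomputes F1 for each split from the precision and recall values copied from the README. The helper name `f1_from` and the tolerance are illustrative choices, not part of the original notebooks.

```python
import math


def f1_from(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, f1) exactly as reported in the README's Score section.
reported = {
    "training": (0.9432919284600922, 0.8737887200250071, 0.9072110777980404),
    "test": (0.9420980625799903, 0.8722673967229191, 0.9058389204786557),
    "validation": (0.9436194311603322, 0.8680250057883769, 0.9042450554751569),
}

# For each split, compare the recomputed F1 with the reported one.
checks = {}
for split, (precision, recall, f1) in reported.items():
    recomputed = f1_from(precision, recall)
    checks[split] = math.isclose(recomputed, f1, rel_tol=1e-6)
    print(f"{split}: reported={f1:.6f} recomputed={recomputed:.6f} consistent={checks[split]}")
```

The three splits score within about half a percentage point of each other on every metric, which suggests the model is not heavily overfit after its single epoch.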