@@ -10,38 +10,73 @@ metrics:
 model-index:
 - name: toxicity-target-type-identification
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # toxicity-target-type-identification
-This model is a fine-tuned version of [neuralmind/bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.7001
-- Accuracy: 0.7505
-- F1: 0.7603
-- Precision: 0.7813
-- Recall: 0.7505
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
 - learning_rate: 3.952388499692274e-05
 - train_batch_size: 8
 - eval_batch_size: 8
@@ -50,18 +85,13 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: linear
 - num_epochs: 30
-### Training results
-| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
-|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
-| No log        | 1.0   | 355  | 0.7001          | 0.7505   | 0.7603 | 0.7813    | 0.7505 |
-| 0.7919        | 2.0   | 710  | 1.0953          | 0.7505   | 0.7452 | 0.7590    | 0.7505 |
-| 0.5218        | 3.0   | 1065 | 1.4217          | 0.7484   | 0.7551 | 0.7688    | 0.7484 |
 ### Framework versions
 - Transformers 4.26.1
 - Pytorch 1.10.2+cu113
 - Datasets 2.9.0
 - Tokenizers 0.13.2

 model-index:
 - name: toxicity-target-type-identification
   results: []
+datasets:
+- dougtrajano/olid-br
+language:
+- pt
+library_name: transformers
 ---
 # toxicity-target-type-identification
+Toxicity Target Type Identification is a model that classifies the type of target (individual, group, or other) that a targeted toxic text is directed at.
+This BERT model is a fine-tuned version of [neuralmind/bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) on the [OLID-BR dataset](https://huggingface.co/datasets/dougtrajano/olid-br).
+## Overview
+**Input:** Text in Brazilian Portuguese
+**Output:** Multiclass classification (individual, group, or other)
+## Usage
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
+tokenizer = AutoTokenizer.from_pretrained("dougtrajano/toxicity-target-type-identification")
+model = AutoModelForSequenceClassification.from_pretrained("dougtrajano/toxicity-target-type-identification")
+```
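The snippet above only loads the tokenizer and model. As a minimal sketch of the remaining step, the code below turns a vector of class logits into a label and a probability. The helper names (`softmax`, `top_label`) and the example logits and label order are illustrative assumptions, not part of the model card; the real mapping should be read from `model.config.id2label`.

```python
import math
from typing import Dict, List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top_label(logits: List[float], id2label: Dict[int, str]) -> Tuple[str, float]:
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Illustrative values only: the label order is an assumption, so read the
# real mapping from model.config.id2label instead of hard-coding it.
example_id2label = {0: "INDIVIDUAL", 1: "GROUP", 2: "OTHER"}
example_logits = [2.0, 0.5, -1.0]

label, score = top_label(example_logits, example_id2label)
print(label, round(score, 4))
```

With the loaded model, the logits for one text would typically come from `model(**tokenizer(text, return_tensors="pt")).logits[0].tolist()`, ideally inside `torch.no_grad()`.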
+## Limitations and bias
+The following factors may degrade the model’s performance.
+**Text Language**: The model was trained on Brazilian Portuguese texts, so it may not work well on other varieties of Portuguese.
+**Text Origin**: The model was trained on texts from social media and a few texts from other sources, so it may not work well on other types of texts.
+## Trade-offs
+Models can underperform under particular circumstances. This section describes situations in which the model may perform less than optimally, so that you can plan accordingly.
+**Text Length**: The model was fine-tuned on texts of 1 to 178 words (18 words on average). It may give poor results on texts outside this range.
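A minimal sketch of a guard for the text-length trade-off, which flags inputs whose word count falls outside the fine-tuning range. Counting words by whitespace splitting is an assumption, since the card does not say how words were counted, and the function names are hypothetical.

```python
# Word-count range reported for the fine-tuning texts.
MIN_WORDS = 1
MAX_WORDS = 178


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def in_training_length_range(text: str) -> bool:
    """Return True when the text length matches what the model saw in training."""
    return MIN_WORDS <= word_count(text) <= MAX_WORDS


sample = "Você é muito chato"
if not in_training_length_range(sample):
    print("Warning: text length is outside the range seen during fine-tuning.")
```

A check like this could run before classification so that predictions on very long texts are treated with extra caution.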
+## Performance
+The model was evaluated on the test set of the [OLID-BR](https://dougtrajano.github.io/olid-br/) dataset.
+**Accuracy:** 0.7505
+**Precision:** 0.7812
+**Recall:** 0.7505
+**F1-Score:** 0.7603
+| Class | Precision | Recall | F1-Score | Support |
+| :---: | :-------: | :----: | :------: | :-----: |
+| `INDIVIDUAL` | 0.8850 | 0.7964 | 0.8384 | 609 |
+| `GROUP` | 0.6766 | 0.6385 | 0.6570 | 213 |
+| `OTHER` | 0.4518 | 0.7177 | 0.5545 | 124 |
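The card does not state how the aggregate precision, recall, and F1-score were averaged. One plausible reading is a support-weighted average of the per-class rows. The sketch below recomputes that average from the table values as a consistency check; the variable and function names are illustrative.

```python
# Per-class metrics copied from the table: (precision, recall, f1, support).
per_class = {
    "INDIVIDUAL": (0.8850, 0.7964, 0.8384, 609),
    "GROUP": (0.6766, 0.6385, 0.6570, 213),
    "OTHER": (0.4518, 0.7177, 0.5545, 124),
}

total_support = sum(row[3] for row in per_class.values())


def weighted_average(column: int) -> float:
    """Average one metric column, weighting each class by its support."""
    return sum(row[column] * row[3] for row in per_class.values()) / total_support


weighted = {
    "precision": weighted_average(0),
    "recall": weighted_average(1),
    "f1": weighted_average(2),
}

for name, value in weighted.items():
    print(f"{name}: {value:.4f}")
```

Under this assumption the recomputed values agree with the reported aggregates to about three decimal places. The per-class inputs are themselves rounded, so the last digit may differ. Support-weighted recall equals accuracy by construction, which matches the identical accuracy and recall figures.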
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
 - learning_rate: 3.952388499692274e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - lr_scheduler_type: linear
 - num_epochs: 30
 ### Framework versions
 - Transformers 4.26.1
 - Pytorch 1.10.2+cu113
 - Datasets 2.9.0
 - Tokenizers 0.13.2
+## Provide Feedback
+If you have any feedback on this model, please [open an issue](https://github.com/DougTrajano/ToChiquinho/issues/new) on GitHub.