cwinkler committed on
Commit
51fe2f0
1 Parent(s): 6bb46db

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -34,12 +34,14 @@ should probably proofread and complete it, then remove this comment. -->

  This model (distilbert-base-uncased-finetuned-greenplastics-3) classifies patents into "green plastics" or "no green plastics" by their abstracts.

- The model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [green plastics dataset](https://huggingface.co/datasets/cwinkler/patents_green_plastics). The green patent dataset was split into 70 % training data and 30 % test data (using ".train_test_split(test_size=0.3)").
+ The model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [green plastics dataset](https://huggingface.co/datasets/cwinkler/patents_green_plastics) (11,196 patent abstracts). The green plastics dataset was split into 70 % training data and 30 % test data (using ".train_test_split(test_size=0.3)").
  The model achieves the following results on the evaluation set:

  - Accuracy: 0.8574
  - F1: 0.8573

+ The maximum number of training steps was set to 200 to avoid overfitting. I considered an accuracy of 0.8574 to be suitable for the task. Further training would lead to a higher accuracy, but testing the resulting model with random examples was not really satisfying. That is why I chose to limit the training steps.
+
  ## EPO - CodeFest on Green Plastics

  The model has been developed for submission to the [CodeFest on Green Plastics](https://www.epo.org/news-events/in-focus/codefest.html) by the European Patent Office (EPO).
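
For readers who want to see what the 70/30 split mentioned in the README change amounts to, here is a minimal sketch using only the Python standard library. It is a stand-in for the `.train_test_split(test_size=0.3)` call, not the author's code. The function name `split_70_30`, the seed, the placeholder records, and the rounding convention (test share rounded up, remainder to training) are all assumptions made for illustration.

```python
import math
import random


def split_70_30(samples, test_size=0.3, seed=0):
    """Shuffle `samples` and return (train, test) lists.

    Hypothetical helper: the test share is rounded up and everything
    else goes to training (an assumed rounding convention).
    """
    n_test = math.ceil(test_size * len(samples))
    shuffled = list(samples)  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder records standing in for the 11,196 patent abstracts.
abstracts = [f"abstract {i}" for i in range(11196)]
train_set, test_set = split_70_30(abstracts)
print(len(train_set), len(test_set))
```

Under these assumptions the split yields 7837 training and 3359 test records. The reported accuracy and F1 would then be computed on the held-out 30 % only.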