---
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - f1
model-index:
  - name: distilbert-base-uncased-finetuned-greenplastics-3
    results: []
widget:
  - text: >-
      The present disclosure relates to a process for recycling of plastic waste
      comprising: segregating plastic waste collected from various sources
      followed by cleaning of the segregated plastic waste to obtain segregated
      cleaned waste; grinding of the segregated cleaned waste to obtain grinded
      waste; introducing the grinded waste into an extrusion line having a
      venting extruder component as part of the extrusion line, to obtain molten
      plastic; and removing the impurities by vacuum venting of the molten
      plastic to obtained recycled plastic free from impurities. The present
      disclosure further relates to various articles like Industrial Post
      Recycled (IPR) plastic tubes, blow moulded bottles, pallates, manufactured
      from the recycled plastic waste.
language:
  - en
pipeline_tag: text-classification
library_name: transformers
---

# Classification of patent abstracts: "Green Plastics" or "No Green Plastics"

This model (`distilbert-base-uncased-finetuned-greenplastics-3`) classifies patents as "green plastics" or "no green plastics" based on their abstracts.
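The model can be used through the `transformers` text-classification pipeline. A minimal usage sketch, assuming the Hub id `cwinkler/distilbert-base-uncased-finetuned-greenplastics-3` (the uploader's namespace is an assumption):

```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
# Hub id is assumed from the model name on this card.
classifier = pipeline(
    "text-classification",
    model="cwinkler/distilbert-base-uncased-finetuned-greenplastics-3",
)

abstract = (
    "The present disclosure relates to a process for recycling of plastic "
    "waste comprising: segregating plastic waste collected from various "
    "sources followed by cleaning of the segregated plastic waste."
)
# Returns a list of {'label': ..., 'score': ...} dicts.
print(classifier(abstract))
```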

The model is a fine-tuned version of `distilbert-base-uncased` on the green plastics dataset. The dataset was split into 70% training data and 30% test data (using `train_test_split(test_size=0.3)`). The model achieves the following results on the evaluation set:

- Accuracy: 0.8574
- F1: 0.8573
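The card uses `datasets.Dataset.train_test_split` for the 70/30 split; the underlying logic can be approximated in plain Python. A minimal sketch, with a made-up list of examples standing in for the dataset:

```python
import random

def train_test_split(examples, test_size=0.3, seed=42):
    """Shuffle and split a list of examples, approximating
    datasets.Dataset.train_test_split(test_size=0.3)."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

# Made-up dataset of 100 examples.
train, test = train_test_split(list(range(100)), test_size=0.3)
print(len(train), len(test))  # 70 30
```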

## EPO CodeFest on Green Plastics

The model has been developed for submission to the CodeFest on Green Plastics by the European Patent Office (EPO).

The task:

"To develop creative and reliable artificial intelligence (AI) models for automating the identification of patents related to green plastics."

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 200
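These hyperparameters map onto `transformers.TrainingArguments` roughly as follows. This is a config sketch, not the card's actual training script; the `output_dir` name is hypothetical, and `max_steps` corresponds to the `training_steps` value listed above:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-base-uncased-finetuned-greenplastics-3",  # hypothetical
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,            # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    max_steps=200,             # "training_steps: 200" above
)
```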

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
| No log        | 0.2   | 200  | 0.3435          | 0.8574   | 0.8573 |
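For reference, the reported accuracy and F1 can be computed directly from predictions. A minimal dependency-free sketch on made-up labels (the card's metrics most likely come from the Hugging Face `evaluate`/`datasets` implementations, which this mirrors with `average="weighted"`-style F1):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_weighted(y_true, y_pred):
    """F1 per class, averaged with weights proportional to class support."""
    total = 0.0
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        total += f1 * y_true.count(c) / len(y_true)
    return total

# Made-up predictions: 1 = "green plastics", 0 = "no green plastics"
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1]
print(round(accuracy(y_true, y_pred), 3))  # 0.667
```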

### Framework versions

- Transformers 4.25.1
- Pytorch 1.13.0+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2