---
language: en
license: apache-2.0
datasets:
- ESGBERT/action_500
tags:
- ESG
- environmental
- action
---

# Model Card for EnvironmentalBERT-action

## Model Description

As an extension to [this paper](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4622514), this is the EnvironmentalBERT-action language model, a language model trained to better classify action texts in the ESG domain.

Using the [EnvironmentalBERT-base](https://huggingface.co/ESGBERT/EnvironmentalBERT-base) model as a starting point, the EnvironmentalBERT-action language model is additionally fine-tuned on a dataset of 500 sentences to detect action text samples. The underlying dataset is comparatively small, so if you would like to contribute to it, feel free to reach out. For instance, you could find a set of misclassifications and send it to me. :)

## How to Get Started With the Model

It is highly recommended to first classify whether a sentence is "environmental" or not with the [EnvironmentalBERT-environmental](https://huggingface.co/ESGBERT/EnvironmentalBERT-environmental) model before classifying whether it is "action" or not. Taking the intersection of the two results gives a targeted view of whether the sentence describes an "environmental action".
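
The two-step check recommended above could be sketched roughly as follows. This is an illustration under assumptions, not the authors' implementation: the helper names `load_classifier` and `find_environmental_actions` are hypothetical, and the positive label strings `"environmental"` and `"action"` are assumed, so verify them against each model's `config.id2label` before relying on the filter.

```python
from typing import Any, Callable, Dict, List

# A classifier maps a list of sentences to one {"label": ..., "score": ...} dict per sentence.
Classifier = Callable[[List[str]], List[Dict[str, Any]]]


def load_classifier(model_name: str) -> Classifier:
    """Hypothetical helper: wrap a Hub model in a text-classification pipeline."""
    # Imported lazily so the filtering logic below can be used without transformers.
    from functools import partial
    from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name, model_max_length=512)
    pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)
    # Pad and truncate so long sentences do not exceed the model's input size.
    return partial(pipe, padding=True, truncation=True)


def find_environmental_actions(
    sentences: List[str],
    env_classifier: Classifier,
    action_classifier: Classifier,
    env_label: str = "environmental",  # assumed label name
    action_label: str = "action",      # assumed label name
) -> List[str]:
    """Return the sentences that both classifiers flag as positive."""
    if not sentences:
        return []
    # Step 1: keep only sentences classified as environmental.
    env_results = env_classifier(sentences)
    env_sentences = [s for s, r in zip(sentences, env_results) if r["label"] == env_label]
    if not env_sentences:
        return []
    # Step 2: among those, keep only sentences classified as action.
    action_results = action_classifier(env_sentences)
    return [s for s, r in zip(env_sentences, action_results) if r["label"] == action_label]


# Usage (downloads both models from the Hugging Face Hub):
# env = load_classifier("ESGBERT/EnvironmentalBERT-environmental")
# act = load_classifier("ESGBERT/EnvironmentalBERT-action")
# print(find_environmental_actions(
#     ["We are actively working to reduce our CO2 emissions by planting trees in 25 countries."],
#     env, act))
```

Running the environmental filter first means the action model only sees sentences that already passed the first check, which matches the intersection described above.
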

You can use the model with a pipeline for text classification:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

tokenizer_name = "ESGBERT/EnvironmentalBERT-action"
model_name = "ESGBERT/EnvironmentalBERT-action"

# Load the fine-tuned classifier and its tokenizer from the Hugging Face Hub.
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name, model_max_length=512)

pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)  # set device=0 to use GPU

# See https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.pipeline
print(pipe("We are actively working to reduce our CO2 emissions by planting trees in 25 countries.", padding=True, truncation=True))
```

## More details on the base models can be found in this paper

While this dataset does not originate from the paper, it is an extension of it, and the base models are described in it.

```bibtex
@article{Schimanski23ESGBERT,
    title={{Bridging the Gap in ESG Measurement: Using NLP to Quantify Environmental, Social, and Governance Communication}},
    author={Tobias Schimanski and Andrin Reding and Nico Reding and Julia Bingler and Mathias Kraus and Markus Leippold},
    year={2023},
    journal={Available on SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4622514},
}
```