language:
- en
pipeline_tag: text-classification
widget:
- text: >-
    And it was great to see how our Chinese team very much aware of that and
    of shifting all the resourcing to really tap into these opportunities.
  example_title: Exemplary Transformation Sentence
- text: >-
    But we will continue to recruit even after that because we expect that the
    volumes are going to continue to grow.
  example_title: Exemplary Non-Transformation Sentence
- text: >-
    So and again, we'll be disclosing the current taxes that are there in
    Guyana, along with that revenue adjustment.
  example_title: Exemplary Non-Transformation Sentence
TransformationTransformer
TransformationTransformer is a fine-tuned distilroberta model. It is trained and evaluated on 10,000 manually annotated sentences drawn from the Q&A section of quarterly earnings conference calls. Specifically, it was trained on sentences spoken by firm executives to discriminate between sentences that allude to business transformation and sentences that discuss other topics. More details about the training procedure can be found below.
Background
Context on the project.
Usage
The model is intended to be used for sentence classification: it creates a contextual text representation from the input sentence and outputs a probability value. LABEL_1 refers to a sentence that is predicted to contain transformation-related content, and LABEL_0 to a sentence that is not. The query should consist of a single sentence.
Usage (API)
import json
import requests

API_TOKEN = "<TOKEN>"  # replace with your Hugging Face API token
headers = {"Authorization": f"Bearer {API_TOKEN}"}
API_URL = "https://api-inference.huggingface.co/models/simonschoe/TransformationTransformer"

def query(payload):
    # Send the payload as JSON and decode the JSON response.
    data = json.dumps(payload)
    response = requests.request("POST", API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8"))

query({"inputs": "<insert-sentence-here>"})
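The reply from the hosted API still has to be unpacked before the label scores can be used. The following is a minimal sketch, not part of the original card: it assumes that a successful reply for a single input is a list wrapping one list of label/score dicts, and that a failed reply (for example while the model is still loading) is a dict with an "error" key. The helper name scores_from_api_response is hypothetical.

```python
def scores_from_api_response(response):
    """Flatten an Inference API reply into a {label: score} mapping."""
    # Assumption: error replies arrive as a dict carrying an "error" message.
    if isinstance(response, dict):
        raise RuntimeError(response.get("error", "unexpected API response"))

    # Assumption: a single input comes back as [[{...}, {...}]];
    # a flat list of dicts is accepted as well.
    if response and isinstance(response[0], list):
        entries = response[0]
    else:
        entries = response

    return {entry["label"]: float(entry["score"]) for entry in entries}


# Hypothetical reply, shaped as assumed above (scores are made up).
example_reply = [[
    {"label": "LABEL_0", "score": 0.25},
    {"label": "LABEL_1", "score": 0.75},
]]
scores = scores_from_api_response(example_reply)
```

With a real reply, scores["LABEL_1"] would then hold the predicted probability that the sentence is transformation-related.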
Usage (transformers)
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the fine-tuned tokenizer and classification model from the Hub.
tokenizer = AutoTokenizer.from_pretrained("simonschoe/TransformationTransformer")
model = AutoModelForSequenceClassification.from_pretrained("simonschoe/TransformationTransformer")

# Wrap both in a text-classification pipeline and classify a single sentence.
classifier = pipeline('text-classification', model=model, tokenizer=tokenizer)
classifier('<insert-sentence-here>')
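By default, a text-classification pipeline returns only the top label and its score for each input, so the probability of LABEL_1 is not always reported directly. The sketch below shows one way to recover it. It assumes the two label scores sum to one (as they do when the pipeline applies a softmax over two classes); the helper name transformation_probability and the 0.5 decision threshold are hypothetical choices, not part of the original card.

```python
def transformation_probability(result):
    """Return the probability of LABEL_1 from one classification result.

    Accepts either a single dict such as {"label": "LABEL_0", "score": 0.9}
    (top label only) or a list of such dicts covering both labels.
    """
    if isinstance(result, dict):
        score = float(result["score"])
        # Assumption: binary softmax output, so P(LABEL_1) = 1 - P(LABEL_0).
        return score if result["label"] == "LABEL_1" else 1.0 - score

    for entry in result:
        if entry["label"] == "LABEL_1":
            return float(entry["score"])
    raise ValueError("no LABEL_1 entry found in result")


# Hypothetical pipeline output for one sentence (score is made up).
example_output = [{"label": "LABEL_0", "score": 0.9}]
probability = transformation_probability(example_output[0])

# Hypothetical threshold; tune it to the precision/recall trade-off you need.
is_transformation = probability >= 0.5
```

With the pipeline from above, the call would be transformation_probability(classifier(sentence)[0]), since the pipeline returns one result per input sentence.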
Model Training
The model was trained on text data from earnings call transcripts. The data is restricted to each call's question-and-answer (Q&A) section and, within it, to the remarks made by firm executives. The text was segmented into individual sentences using spaCy.
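The card names spaCy for sentence segmentation but does not say which pipeline or version was used. The following is therefore only a sketch under assumptions: it uses spaCy v3 with a blank English pipeline and the rule-based sentencizer component, which needs no model download. The helper name split_into_sentences is hypothetical, and a statistical pipeline may split transcripts differently.

```python
import spacy

# Assumption: a blank English pipeline with the rule-based sentencizer.
# The original project may have used a trained pipeline instead.
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")


def split_into_sentences(text):
    """Split a block of transcript text into a list of sentence strings."""
    doc = nlp(text)
    # Drop empty fragments that can result from stray whitespace.
    return [sent.text.strip() for sent in doc.sents if sent.text.strip()]


# Hypothetical executive answer, used only to illustrate the call.
answer = "We expect volumes to grow. We will continue to recruit."
sentences = split_into_sentences(answer)
```

Each element of sentences can then be passed to the classifier on its own, matching the single-sentence input the model expects.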
Statistics of Training Data:
- Labeled sentences: 10,000
- Data distribution: xxx
- Inter-coder agreement: xxx
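The card does not state which agreement statistic was used. For two annotators assigning binary labels, Cohen's kappa is one common choice, and the sketch below shows how it could be computed. The function name cohens_kappa and the example labels are hypothetical and do not reflect the project's actual annotations.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Compute Cohen's kappa for two annotators' label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")

    n = len(labels_a)
    # Observed agreement: share of items both annotators labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up labels for four sentences (1 = transformation, 0 = other).
annotator_a = [1, 1, 0, 0]
annotator_b = [1, 0, 0, 0]
kappa = cohens_kappa(annotator_a, annotator_b)
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one class is much more frequent than the other.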
The following code snippet presents the training pipeline: