Model Card for Sami92/XLM-R-Large-PartyPress-Telegram

This model is fine-tuned for topic classification and uses the labels provided by the Comparative Agendas Project. It can be used for the downstream task of classifying Telegram posts into 23 policy areas. It is similar to partypress/partypress-multilingual, but its base model is FacebookAI/xlm-roberta-large, and it was fine-tuned on more data and on different data sources.

Model Details

Model Description

This model is based on FacebookAI/xlm-roberta-large and was trained in a three-step process. In the first step, a dataset of press releases was weakly labeled with GPT-4o, and the model was trained on this data. In the second step, it was fine-tuned again on GPT-4o-labeled data, this time drawn from Telegram. In the third step, it was trained on the same human-annotated dataset as partypress/partypress-multilingual. The weak pre-training in the first two steps led to improved results on Telegram data.
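The three-step process can be read as sequential fine-tuning, where each step starts from the checkpoint produced by the previous one. The following is a minimal sketch of that ordering only, not the original training script: `run_sequential_fine_tuning`, `train_one_stage`, and `fake_trainer` are hypothetical names, and the real training call is replaced by a stand-in that just records the order of the steps.

```python
from typing import Callable, Sequence

# The three steps described above, in training order.
STAGES = (
    "press releases, weakly labeled with GPT-4o",
    "Telegram posts, weakly labeled with GPT-4o",
    "press releases, human-annotated",
)


def run_sequential_fine_tuning(
    base_checkpoint: str,
    stages: Sequence[str],
    train_one_stage: Callable[[str, str], str],
) -> str:
    """Run the stages in order, feeding each stage the previous checkpoint.

    `train_one_stage` is a hypothetical callable that takes a checkpoint
    and a stage name and returns the resulting checkpoint identifier.
    """
    checkpoint = base_checkpoint
    for stage in stages:
        checkpoint = train_one_stage(checkpoint, stage)
    return checkpoint


def fake_trainer(checkpoint: str, stage: str) -> str:
    # Stand-in for real training: only records which stage was applied.
    return f"{checkpoint} -> [{stage}]"


final_checkpoint = run_sequential_fine_tuning(
    "FacebookAI/xlm-roberta-large", STAGES, fake_trainer
)
print(final_checkpoint)
```

In an actual run, `train_one_stage` would wrap a full fine-tuning loop for one dataset and return the path of the saved model.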

Bias, Risks, and Limitations

[More Information Needed]

How to Get Started with the Model

Use the code below to get started with the model.

>>> from transformers import pipeline

>>> texts = ['Neue Anschuldigungen gegen die russischen Angriffstruppen in der Ukraine: Laut den USA sollen diese Chlorpikrin als Kampfstoff verwendet haben. Das sei ein Verstoß gegen die Chemiewaffenkonvention. /',
            'Tiktok ist ja eine chinesische App. Bestimmt wird bald über eine Tonaufnahme diskutiert, die der tschechische Geheimdienst aufgezeichnet hat: Krah am Telefon mit einem chinesischen Tech-Entwickler im TikTok-Business, der den Algorithmus extra zu Gunsten der AfD manipuliert.',
            'Saubere Bluttransfusion ,dem normalen Menschen ,ist die eigen Blut Spende nicht mehr erlaubt bzw gibt es wiedermal die Empfehlung von hochrangiger Stelle an Blutspendedienste und Krankenhäuser dieses nicht zu ermöglichen.In gehobenen Kreisen sind private Dienstleister in dieser Nische sehr aktiv.']

>>> tokenizer_kwargs = {'padding': True, 'truncation': True, 'max_length': 512}
>>> partypress_telegram = pipeline("text-classification", model="Sami92/XLM-R-Large-PartyPress-Telegram", tokenizer="Sami92/XLM-R-Large-PartyPress-Telegram", **tokenizer_kwargs)

>>> partypress_telegram(texts)
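For a list of texts, a text-classification pipeline returns one dictionary per input with a `label` and a `score` key. The sketch below shows one way to pair those predictions with the input texts and flag low-scoring ones. The helper `summarize_predictions`, the 0.5 threshold, and the example labels and scores are illustrative assumptions, not output from this model.

```python
from typing import Dict, List, Sequence, Union

Prediction = Dict[str, Union[str, float]]


def summarize_predictions(
    texts: Sequence[str],
    predictions: Sequence[Prediction],
    min_score: float = 0.5,  # arbitrary illustrative threshold
) -> List[dict]:
    """Pair each text with its predicted label and flag low-score predictions."""
    if len(texts) != len(predictions):
        raise ValueError("texts and predictions must have the same length")
    rows = []
    for text, pred in zip(texts, predictions):
        rows.append(
            {
                "text": text,
                "label": pred["label"],
                "score": pred["score"],
                "confident": pred["score"] >= min_score,
            }
        )
    return rows


# Made-up values in the pipeline's output format, for demonstration only.
example_texts = ["post a", "post b"]
example_predictions = [
    {"label": "Defense", "score": 0.91},
    {"label": "Health", "score": 0.42},
]

rows = summarize_predictions(example_texts, example_predictions)
for row in rows:
    print(row["label"], row["score"], row["confident"])
```

With the real model, `example_predictions` would be replaced by the result of `partypress_telegram(texts)`.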

Training Details

Training Data

The model was trained on three datasets, each based on the data from partypress/partypress-multilingual. The first dataset was weakly labeled using GPT-4o. The prompt contained the label descriptions taken from Erfort et al. (2023). This weakly labeled dataset contains 32,060 press releases. The second dataset was drawn from Telegram channels. Specifically, posts were sampled from about 200 channels that have been the subject of a fact-check by Correctiv, dpa, Faktenfuchs, or AFP. In total, 7,741 posts were sampled and weakly annotated by GPT-4o with the same prompt as before. The third dataset is the human-annotated dataset that was used for training partypress/partypress-multilingual. For training, only the single-coded examples were used (24,117).

Training Hyperparameters

Epochs: 10
Batch size: 16
learning_rate: 2e-5
weight_decay: 0.01
fp16: True
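The values above could be mapped onto `transformers.TrainingArguments` roughly as follows. This is a sketch under assumptions rather than the original training configuration: the card does not say whether the batch size is per device, so mapping it to `per_device_train_batch_size` is a guess, and `build_training_args` and its `output_dir` argument are hypothetical.

```python
# Hyperparameters as listed in this card, keyed by TrainingArguments names.
HYPERPARAMS = {
    "num_train_epochs": 10,
    "per_device_train_batch_size": 16,  # assumption: "Batch size" is per device
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "fp16": True,  # mixed precision; needs a GPU that supports it
}


def build_training_args(output_dir: str):
    """Create TrainingArguments from the listed values.

    The import is done lazily so that the dictionary above can be
    inspected without transformers and torch being installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


print(HYPERPARAMS)
```

Whether the same settings were used in all three training steps is not stated in this card.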

Evaluation

Testing Data

The model was evaluated on two datasets. The first consists of the press releases that were annotated by two human coders per example (3,121). It is the same test data as used for Sami92/XLM-R-Large-PartyPress. For testing on Telegram data, a sample of 84 posts was taken and labeled by the model. Three annotators were then asked whether the model's prediction was a main topic of the post, a subtopic, or incorrect. The majority vote was used as the final label.
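The majority vote over three annotators can be sketched as below. The function name `majority_label` is hypothetical, and the handling of a three-way split (returning `None`) is an assumption, since the card does not say how such cases were resolved or whether they occurred.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the most frequent vote, or None if the top count is tied."""
    if not votes:
        raise ValueError("at least one vote is required")
    ranked = Counter(votes).most_common()
    top_label, top_count = ranked[0]
    # A tie for first place means there is no majority (assumed behavior).
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None
    return top_label


two_to_one = majority_label(["Maintopic", "Maintopic", "Subtopic"])
three_way_split = majority_label(["Maintopic", "Subtopic", "Incorrect"])
print(two_to_one, three_way_split)
```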

On the first dataset, consisting of press releases, the F1 score dropped from 0.72 for Sami92/XLM-R-Large-PartyPress to 0.62 for this model.

On the second test, there is an improvement. The detailed results can be found below. For 93% of the Telegram posts, the model's prediction was either a main topic or a subtopic, compared with only 88% for Sami92/XLM-R-Large-PartyPress. The improvement is even more visible when focusing on main topics only: the Telegram-fine-tuned model predicts a main topic in 82% of the cases, whereas the model without training on Telegram data does so in 75%.

Results

Label	Proportion	Count
Maintopic	0.82	69
Subtopic	0.11	9
Incorrect	0.07	6
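As a quick arithmetic check, the proportions in the table above follow directly from the counts over the 84 sampled posts. The variable names in this sketch are illustrative.

```python
# Counts from the table above (84 Telegram posts in total).
counts = {"Maintopic": 69, "Subtopic": 9, "Incorrect": 6}
total = sum(counts.values())

# Share of each label, rounded to two decimals as in the table.
proportions = {label: round(n / total, 2) for label, n in counts.items()}

# Share of posts where the prediction is a main topic or a subtopic.
main_or_sub = round((counts["Maintopic"] + counts["Subtopic"]) / total, 2)

print(total, proportions, main_or_sub)
```

The combined main-topic and subtopic share corresponds to the 93% reported in the evaluation text.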

Category	Significance	Proportion
Agriculture	Maintopic	1.00
Civil Rights	Incorrect	0.75
	Maintopic	0.25
Culture	Maintopic	1.00
Defense	Maintopic	1.00
Domestic Commerce	Maintopic	0.50
	Incorrect	0.50
Education	Maintopic	0.67
	Subtopic	0.33
Energy	Maintopic	1.00
Environment	Maintopic	1.00
European Union	Maintopic	1.00
Foreign Trade	Maintopic	0.75
	Subtopic	0.25
Government Operations	Maintopic	0.83
	Incorrect	0.17
Health	Maintopic	1.00
Housing	Maintopic	1.00
Immigration	Maintopic	1.00
International Affairs	Maintopic	1.00
Labor	Maintopic	0.50
	Subtopic	0.50
Law and Crime	Maintopic	1.00
Macroeconomics	Maintopic	0.67
	Incorrect	0.17
	Subtopic	0.17
Other	Maintopic	1.00
Technology	Maintopic	1.00
Transportation	Subtopic	0.80
	Maintopic	0.20

Acknowledgements

I thank Cornelius Erfort for making the annotated press releases available.