license: cc-by-nc-4.0
datasets:
  - coastalcph/eu_debates
language:
  - en
library_name: peft
base_model: meta-llama/Llama-2-13b-chat-hf
pipeline_tag: text-generation
tags:
  - politics
widget:
  - text: >-
      What is your opinion on "Immigration, the role of Frontex and cooperation
      among Member States"?
extra_gated_heading: >-
  You need to share contact information to access this model. You have to
  briefly describe your affiliation and the research scope of your work. If not,
  the request will automatically be rejected. The whole purpose is to filter
  requests and avoid easily identifiable misuse by AI bots or individuals with
  malicious intents.
extra_gated_fields:
  First Name: text
  Last Name: text
  Country: country
  Affiliation: text
  Scope of Research: text
extra_gated_button_content: Submit
extra_gated_prompt: >-
  ###  EU Llama-Chat Adapters -  Terms of Use

  Please read carefully the Terms of Use, before requesting access. To grant
  access to this model, you have to accept the following Terms of Use:

  (a) This model is exclusively available for research purposes.

  (b) It is prohibited to use this model for commercial and non-commercial uses
  outside the scope of research.

  (c) It is prohibited to deploy this model publicly on the web.

  (d) The model is released without any warranties.

  (e) The model may produce content that can be considered discriminatory or
  harmful. You are only allowed to share this content with caution when
  discussing biases in your work.

Model Card (Llama-2 13B - LoRA on speeches from members of ID)

In this work (Chalkidis and Brandl, 2024), we adapt Llama 2 Chat to the speeches of members of a European political party (euro-party) from the EU Debates dataset. To do so, we fine-tune the 13B Llama 2 Chat model on the speeches using adapters, specifically Low-Rank Adaptation (LoRA) (Hu et al., 2022). Since we are interested in fine-tuning conversational (chat-based) models, we create instructions as pseudo-QA pairs, similar to Cheng et al. (2023), using the following template:

[INST] What is your opinion on T ? [/INST] U

where the instruction (question) is based on the title (topic) of the debate (T), e.g., "Immigration, the role of Frontex and cooperation among Member States", and U is a clean version of a speech (utterance) from an MEP affiliated with the party of interest.
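As a minimal sketch of how such a pseudo-QA pair could be assembled, the helper below fills the template with a debate title (T) and a speech (U). The function name `build_pseudo_qa`, the whitespace normalisation, and the placeholder speech text are assumptions made for illustration; the released code base may format and clean examples differently.

```python
def build_pseudo_qa(title: str, speech: str) -> str:
    """Fill the pseudo-QA template with a debate title (T) and a speech (U)."""
    # Assumption: collapse runs of whitespace so each pair fits on one line.
    # The authors' actual cleaning steps are not described in this card.
    title = " ".join(title.split())
    speech = " ".join(speech.split())
    # Spacing follows the template as written: "... on T ? [/INST] U"
    return f"[INST] What is your opinion on {title} ? [/INST] {speech}"


example = build_pseudo_qa(
    "Immigration, the role of Frontex and cooperation among Member States",
    "<cleaned speech text>",  # placeholder, not a real speech
)
print(example)
```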

We use a learning rate of 2e-4 and train for 10 epochs over all data points (speeches) from the party of interest. We set the LoRA alpha (α) to 16 and the rank (r) to 8.
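To make these hyperparameters concrete, the sketch below collects them in a small container and derives two quantities that follow from the standard LoRA formulation (Hu et al., 2022): the scaling factor α/r applied to the low-rank update, and the number of parameters LoRA adds to a single weight matrix. The class name `LoraHyperparams` is hypothetical, and the 5120 × 5120 matrix is only an illustrative shape (5120 is the hidden size of Llama 2 13B); this card does not state which modules were adapted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraHyperparams:
    """Hyperparameters reported in this model card (hypothetical container)."""
    rank: int = 8                 # LoRA rank r
    alpha: int = 16               # LoRA alpha
    learning_rate: float = 2e-4
    num_epochs: int = 10

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update B @ A by alpha / r.
        return self.alpha / self.rank

    def added_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        return self.rank * (d_in + d_out)


hp = LoraHyperparams()
print(hp.scaling)
# Illustrative shape only: one 5120 x 5120 projection matrix.
print(hp.added_params(5120, 5120))
```

With α = 16 and r = 8 the update is scaled by 2, and each adapted 5120 × 5120 matrix gains 81,920 trainable parameters, compared with roughly 26 million in the frozen matrix itself.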

The model was developed with the code base in the following GitHub repository: https://github.com/coastalcph/eu-politics-llms/

This is the LoRA adapter for the model adapted to the speeches from MEPs affiliated with the Identity and Democracy Party (ID).

We have also released the following LoRA adapters:

Radar Plots

Caution / Unintended Use / Biases

The adapted models can be seen as data-driven mirrors of the parties' ideologies, but they are by no means "perfectly" aligned and may therefore misrepresent them. The models have been developed solely for research purposes and should not be used to generate content that is then shared publicly. We urge the community and the public to refer to credible sources, e.g., parties' programs, interviews, and original speeches, when seeking political information. In some cases, the models may generate text that can be considered hateful, toxic, harmful, or inappropriate. Please use them with caution.

How to use

import torch
import transformers
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel


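# Load the base model and tokenizer
# Note: attn_implementation="flash_attention_2" requires the flash-attn package and a
# compatible GPU; omit the argument to fall back to the default attention implementation.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-13b-chat-hf")
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-chat-hf",
                                                  device_map="auto",
                                                  torch_dtype=torch.float16,
                                                  attn_implementation="flash_attention_2")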

# Load the LoRA adapter
model = PeftModel.from_pretrained(base_model,
                                  "coastalcph/Llama-2-13b-chat-hf-LoRA-eu-debates-id",
                                  device_map="auto")


# Build pipeline
pipeline = transformers.pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer
        )
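Once the pipeline is built, it can be queried with the same pseudo-QA template used for fine-tuning. The helper below is a minimal sketch rather than part of the released code: the name `ask_opinion` and the generation settings (`max_new_tokens=256`, greedy decoding) are assumptions chosen for illustration. It accepts any callable that behaves like a text-generation pipeline, i.e. one that returns a list of dicts with a `generated_text` key.

```python
from typing import Any, Callable


def ask_opinion(generator: Callable[..., Any], topic: str,
                max_new_tokens: int = 256) -> str:
    """Ask the adapted model for its opinion on a debate topic."""
    # Same instruction format as in fine-tuning, without the answer part.
    prompt = f"[INST] What is your opinion on {topic} ? [/INST]"
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,  # assumed length budget
        do_sample=False,                # greedy decoding (assumption)
        return_full_text=False,         # return only the continuation
    )
    return outputs[0]["generated_text"].strip()
```

For example, `ask_opinion(pipeline, "Immigration, the role of Frontex and cooperation among Member States")` would pass the widget question from this card to the `pipeline` object built above.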

Citation Information

Llama meets EU: Investigating the European political spectrum through the lens of LLMs. Ilias Chalkidis and Stephanie Brandl. In the Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Mexico City, Mexico, June 16–21, 2024.

@inproceedings{chalkidis-and-brandl-eu-llama-2024,
    title = "Llama meets EU: Investigating the European political spectrum through the lens of LLMs",
    author = "Chalkidis, Ilias  and Brandl, Stephanie",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics",
    month = jun,
    year = "2024",
    address = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
}