metadata

language: en
license: cc-by-4.0
tags: text-classification
repo: N.A.

Model Card for llama2-promt-av-binary-lora

This model was trained as part of the coursework for COMP34812.

This is a binary classification model that was trained with prompt input to detect whether two pieces of text were written by the same author.

Model Details

Model Description

This model is based on a Llama2 model that was fine-tuned on 30K pairs of texts for authorship verification. It is fine-tuned with prompt inputs so that it can draw on the base model's linguistic knowledge. Demo code for running the model is provided in the submitted demo.ipynb. For best results, use the pre-processing and post-processing functions (also provided in demo.ipynb) together with the model.
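As a rough illustration of what prompt-based pre-processing and post-processing can look like, here is a minimal sketch. The prompt wording, the function names `build_prompt` and `parse_label`, and the convention that label 1 means "same author" are all assumptions made for this example; the functions actually used with the model are the ones in demo.ipynb.

```python
def build_prompt(text_a: str, text_b: str) -> str:
    """Wrap two texts in a HYPOTHETICAL authorship-verification prompt.

    The real prompt template lives in demo.ipynb; this wording is assumed.
    """
    return (
        "Decide whether the two texts below were written by the same author.\n"
        f"Text 1: {text_a.strip()}\n"
        f"Text 2: {text_b.strip()}\n"
        "Answer:"
    )


def parse_label(scores: list[float]) -> int:
    """Turn two class scores into a binary label.

    Assumed convention: index 1 = same author, index 0 = different authors.
    """
    return int(scores[1] > scores[0])


if __name__ == "__main__":
    print(build_prompt("  Dear all, please find attached...  ", "Hi team, see attached."))
    print(parse_label([0.2, 0.8]))
```

Keeping the template in a single function means the same wording is applied at training and inference time, which matters for a model fine-tuned on prompted inputs.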

Developed by: Hei Chan and Mehedi Bari
Language(s): English
Model type: Supervised
Model architecture: Transformers
Finetuned from model: meta-llama/Llama-2-7b-hf

Model Resources

Repository: https://huggingface.co/meta-llama/Llama-2-7b-hf
Paper or documentation: https://arxiv.org/abs/2307.09288

Training Details

Training Data

30K pairs of texts drawn from emails, news articles and blog posts.

Training Procedure

Training Hyperparameters

  - learning_rate: 1e-05
  - weight_decay: 0.001
  - train_batch_size: 2
  - gradient_accumulation_steps: 4
  - optimizer: paged_adamw_8bit
  - LoRA r: 64
  - LoRA alpha: 128
  - LoRA dropout: 0.05
  - RSLoRA: True
  - max_grad_norm: 0.3
  - eval_batch_size: 1
  - num_epochs: 1
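A few quantities follow directly from the hyperparameters above and may help when reproducing the run. The sketch below derives them, assuming training on a single GPU (the card lists one V100) and assuming the batch size of 2 is per device. The rank-stabilized scaling factor uses the RSLoRA definition, alpha divided by the square root of r, compared against the standard LoRA factor alpha / r.

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 2
gradient_accumulation_steps = 4
num_epochs = 1
lora_r = 64
lora_alpha = 128

# Taken from the Training Data section: 30K text pairs.
num_training_pairs = 30_000

# Assumption: a single GPU, so no extra data-parallel multiplier.
effective_batch_size = train_batch_size * gradient_accumulation_steps
optimizer_steps = math.ceil(num_training_pairs / effective_batch_size) * num_epochs

# Standard LoRA scales the low-rank update by alpha / r;
# rank-stabilized LoRA (RSLoRA) scales it by alpha / sqrt(r).
standard_lora_scale = lora_alpha / lora_r
rslora_scale = lora_alpha / math.sqrt(lora_r)

if __name__ == "__main__":
    print(f"effective batch size: {effective_batch_size}")   # 8
    print(f"optimizer steps:      {optimizer_steps}")        # 3750
    print(f"standard LoRA scale:  {standard_lora_scale}")    # 2.0
    print(f"RSLoRA scale:         {rslora_scale}")           # 16.0
```

With r = 64, enabling RSLoRA makes the adapter update eight times larger than the standard scaling would for the same alpha, which is worth keeping in mind when comparing against runs that use plain LoRA.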

Speeds, Sizes, Times

  - trained on: V100 16GB
  - overall training time: 59 hours
  - duration per training epoch: 59 hours
  - model size: ~27GB
  - LoRA adapter size: 192 MB

Evaluation

Testing Data & Metrics

Testing Data

The provided development set, which contains 6K text pairs.

Metrics

  - Precision
  - Recall
  - F1-score
  - Accuracy

Results

  - Precision: 80.6%
  - Recall: 80.4%
  - F1-score: 80.3%
  - Accuracy: 80.4%
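For reference, the four metrics can be computed from gold and predicted labels as sketched below. The card does not state which averaging scheme was used for precision, recall and F1, so macro averaging over the two classes is an assumption here, and the labels in the usage example are made-up toy data, not the development set.

```python
def binary_metrics(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Macro-averaged precision, recall and F1, plus accuracy, for labels 0/1.

    Macro averaging is an ASSUMPTION; the card does not state the scheme.
    """
    precisions, recalls, f1s = [], [], []
    for cls in (0, 1):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {
        "precision": sum(precisions) / 2,
        "recall": sum(recalls) / 2,
        "f1": sum(f1s) / 2,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Toy labels for illustration only (1 = same author, by assumption).
    gold = [1, 1, 0, 0]
    pred = [1, 0, 0, 0]
    for name, value in binary_metrics(gold, pred).items():
        print(f"{name}: {value:.3f}")
```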

Technical Specifications

Hardware

  - Mode: Inference
  - VRAM: at least 6 GB
  - Storage: at least 30 GB
  - GPU: RTX 3060
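The gap between the storage and VRAM figures is easier to read with a back-of-the-envelope estimate of weight memory at different precisions. The sketch below assumes roughly 6.74 billion parameters for Llama-2-7b and counts weights only (no activations, KV cache or framework overhead). The card does not state which precision or quantization was used at inference time, so the 4-bit row is only a plausible reading of the 6 GB VRAM figure given that bitsandbytes is listed as a dependency.

```python
# Assumption: approximate parameter count of Llama-2-7b.
NUM_PARAMS = 6.74e9


def weights_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    for label, bits in [("fp32", 32), ("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label:>5}: {weights_gb(NUM_PARAMS, bits):6.2f} GB")
```

Under these assumptions, 4-bit weights take a little over 3 GB, which leaves headroom within 6 GB for the adapter and activations, while full-precision weights land near the ~27 GB listed under model size (the card does not say how that figure was measured).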

Software

  - Transformers
  - PyTorch
  - bitsandbytes
  - Accelerate

Bias, Risks, and Limitations

Any input (the two texts concatenated with the prompt words) longer than 4096 subword tokens will be truncated by the model.
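One way to reduce the impact of this limit is to shorten the two texts before building the prompt, so that neither text is cut off entirely. The sketch below is a hypothetical pre-processing step, not part of the submitted code: `truncate_pair`, its `prompt_len` parameter and the balanced-budget strategy are assumptions, and plain Python lists stand in for the token ids that the Llama tokenizer would produce.

```python
def truncate_pair(tokens_a: list, tokens_b: list,
                  max_len: int = 4096, prompt_len: int = 0) -> tuple[list, list]:
    """Trim two token lists so that, with the prompt, they fit in max_len.

    HYPOTHETICAL helper: the shorter text is kept whole when possible and
    the longer text absorbs the cut; otherwise the budget is split evenly.
    """
    budget = max_len - prompt_len
    if budget < 0:
        raise ValueError("prompt alone exceeds max_len")
    if len(tokens_a) + len(tokens_b) <= budget:
        return tokens_a, tokens_b
    half = budget // 2
    # Text A gets at least half the budget, or more if text B is short.
    keep_a = min(len(tokens_a), max(half, budget - len(tokens_b)))
    keep_b = min(len(tokens_b), budget - keep_a)
    return tokens_a[:keep_a], tokens_b[:keep_b]


if __name__ == "__main__":
    # Stand-in "tokens": integers instead of real tokenizer output.
    a, b = truncate_pair(list(range(5000)), list(range(300)), prompt_len=40)
    print(len(a), len(b), len(a) + len(b) + 40)
```

Truncating from the end of each text is the simplest choice; whether the beginning or end of a document carries more authorship signal is not something the card addresses.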