Cyrus1020 committed on
Commit
a855aa9
1 Parent(s): 8df1548

Upload my_model_card.md

Files changed (1)
  1. my_model_card.md +144 -0
my_model_card.md ADDED
@@ -0,0 +1,144 @@
---
language: en
license: cc-by-4.0
tags:
- text-classification
repo: N.A.
---

# Model Card for y36340hc-z89079mb-AV

<!-- Provide a quick summary of what the model is/does. -->

This is a binary classification model that was trained with prompt inputs to
detect whether two pieces of text were written by the same author.
20
+ ## Model Details
21
+
22
+ ### Model Description
23
+
24
+ <!-- Provide a longer summary of what this model is. -->
25
+
26
+ This model is based upon a Llama2 model that was fine-tuned
27
+ on 30K pairs of texts for authorship verification. The model is trained with prompt inputs to utilize the model's linguistic knowledge.
28
+ To run the model, the demo code is provided in demo.ipynb submitted.
29
+ It is advised to use the pre-processing and post-processing functions (provided in demo.ipynb) along with the model for best results.
30
+
31
+ - **Developed by:** Hei Chan and Mehedi Bari
32
+ - **Language(s):** English
33
+ - **Model type:** Supervised
34
+ - **Model architecture:** Transformers
35
+ - **Finetuned from model [optional]:** meta-llama/Llama-2-7b-hf
36
+
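
The snippet below is a minimal sketch, not the submitted demo.ipynb, of how the LoRA adapter might be combined with the gated Llama-2 base model for inference. It assumes the model is used as a causal LM that generates a Yes/No answer; the adapter path and prompt wording are illustrative placeholders, and the authoritative pre-/post-processing lives in demo.ipynb.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "meta-llama/Llama-2-7b-hf"   # gated; access must be granted first
ADAPTER_PATH = "path/to/lora_adapter"     # hypothetical local adapter directory

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER_PATH)
model.eval()

def same_author(text_a: str, text_b: str) -> str:
    # Illustrative prompt; the actual template is defined in demo.ipynb.
    prompt = (
        "Determine whether the following two texts were written by the same "
        f"author. Answer Yes or No.\n\nText 1: {text_a}\n\nText 2: {text_b}\n\nAnswer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt",
                       truncation=True, max_length=4096).to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=3)
    answer = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                              skip_special_tokens=True)
    return answer.strip()
```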

### Model Resources

<!-- Provide links where applicable. -->

- **Repository:** https://huggingface.co/meta-llama/Llama-2-7b-hf
- **Paper or documentation:** https://arxiv.org/abs/2307.09288

## Training Details

### Training Data

<!-- This is a short stub of information on the training data that was used, and documentation related to data pre-processing or additional filtering (if applicable). -->

30K pairs of texts drawn from emails, news articles and blog posts.

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Training Hyperparameters

<!-- This is a summary of the values of hyperparameters used in training the model. -->

- learning_rate: 1e-05
- weight_decay: 0.001
- train_batch_size: 2
- gradient_accumulation_steps: 4
- optimizer: paged_adamw_8bit
- LoRA r: 64
- LoRA alpha: 128
- LoRA dropout: 0.05
- RSLoRA: True
- max_grad_norm: 0.3
- eval_batch_size: 1
- num_epochs: 1
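
These values map fairly directly onto the peft and transformers APIs. The sketch below is an illustrative reconstruction, not the submitted training script: dataset handling, prompt construction, and any quantisation needed to fit the 16 GB V100 are omitted, the output directory name is hypothetical, and `use_rslora` requires a reasonably recent peft release.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# LoRA configuration matching the values listed above.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    use_rslora=True,        # rank-stabilised LoRA
    task_type="CAUSAL_LM",
)

# Optimiser and schedule settings matching the values listed above.
training_args = TrainingArguments(
    output_dir="av-llama2-lora",       # hypothetical output directory
    learning_rate=1e-5,
    weight_decay=0.001,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,
    optim="paged_adamw_8bit",          # requires bitsandbytes
    max_grad_norm=0.3,
    num_train_epochs=1,
)

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```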

#### Speeds, Sizes, Times

<!-- This section provides information about roughly how long it takes to train the model and the size of the resulting model. -->

- trained on: V100 16GB
- overall training time: 59 hours
- duration per training epoch: 59 minutes
- model size: ~27 GB
- LoRA adapter size: 192 MB

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data & Metrics

#### Testing Data

<!-- This should describe any evaluation data used (e.g., the development/validation set provided). -->

The development set provided, amounting to 6K pairs.

#### Metrics

<!-- These are the evaluation metrics being used. -->

- Precision
- Recall
- F1-score
- Accuracy
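
For reference, the sketch below shows one way these metrics could be computed with scikit-learn. The averaging scheme behind the reported figures is not stated in this card, so macro averaging is only an assumption, and the label arrays are toy placeholders.

```python
# Toy illustration of the listed metrics; not the submitted evaluation code.
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0]  # gold same-author labels (placeholder values)
y_pred = [1, 0, 0, 1, 0]  # model predictions (placeholder values)

print("Precision:", precision_score(y_true, y_pred, average="macro"))
print("Recall:   ", recall_score(y_true, y_pred, average="macro"))
print("F1-score: ", f1_score(y_true, y_pred, average="macro"))
print("Accuracy: ", accuracy_score(y_true, y_pred))
```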

### Results

- Precision: 80.6%
- Recall: 80.4%
- F1-score: 80.3%
- Accuracy: 80.4%

## Technical Specifications

### Hardware

- Mode: Inference
- VRAM: at least 6 GB
- Storage: at least 30 GB
- GPU: RTX 3060

### Software

- Transformers 4.18.0
- PyTorch 1.11.0+cu113

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

Any inputs (concatenation of two sequences plus prompt words) longer than
4096 subwords will be truncated by the model.
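
A quick way to check whether a given pair would hit this limit is sketched below; the prompt template is an illustrative placeholder (the real one is defined in demo.ipynb), and the tokenizer is assumed to be the Llama-2 one.

```python
# Rough length check against the 4096-token context window.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # gated repo
MAX_LEN = 4096

def will_be_truncated(text_a: str, text_b: str) -> bool:
    # Placeholder prompt; the real template is defined in demo.ipynb.
    prompt = f"Text 1: {text_a}\n\nText 2: {text_b}\n\nAnswer:"
    return len(tokenizer(prompt)["input_ids"]) > MAX_LEN
```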

## Additional Information

<!-- Any other information that would be useful for other people to know. -->

The hyperparameters were determined by experimentation
with different values, such that the model could successfully train on the V100 with a gradual decrease in training loss. Since LoRA is used, the Llama2 base model must also
be loaded for the model to function. Access to the pre-trained Llama2 model must be requested; applications can be made at https://huggingface.co/meta-llama.
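
As one possible workflow (not part of the submitted materials), authentication with a Hugging Face access token can be performed before downloading the gated base checkpoint:

```python
# Authenticate with the Hub before downloading the gated base model.
# The token string below is a placeholder; use your own access token
# once access to the Llama-2 weights has been approved.
from huggingface_hub import login

login(token="hf_...")  # placeholder token

# The base model can then be downloaded as usual, e.g.:
# AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
```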