Cyrus1020 committed
Commit add42cd
1 Parent(s): 26ee960

Update README.md

Files changed (1):
  1. README.md +9 -12

README.md CHANGED
@@ -13,6 +13,8 @@ repo: N.A.
 
 <!-- Provide a quick summary of what the model is/does. -->
 
+This model is trained as part of the coursework of COMP34812.
+
 This is a binary classification model that was trained with prompt input to
 detect whether two pieces of text were written by the same author.
 
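The exact prompt template is not shown in this commit; it lives in demo.ipynb. Purely as a hypothetical sketch of the input/output shape the summary describes (a pair of texts mapped to a same-author yes/no decision):

```python
# Hypothetical prompt layout for the authorship-verification task.
# The real template and label mapping are defined in demo.ipynb, not here.
def build_prompt(text1: str, text2: str) -> str:
    return (
        "Determine whether the following two texts were written by the "
        "same author. Answer 'yes' or 'no'.\n"
        f"Text 1: {text1}\n"
        f"Text 2: {text2}\n"
        "Answer:"
    )

print(build_prompt("I rather enjoyed the play.", "The play was most enjoyable."))
```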
 
@@ -23,8 +25,8 @@ This is a binary classification model that was trained with prompt input to
 
 <!-- Provide a longer summary of what this model is. -->
 
-This model is based upon a Llama2 model that was fine-tuned
-on 30K pairs of texts for authorship verification. The model is trained with prompt inputs to utilize the model's linguistic knowledge.
+This model is based on a Llama2 model that was fine-tuned
+on 30K pairs of texts for authorship verification. The model is fine-tuned with prompt inputs to utilize the model's linguistic knowledge.
 To run the model, demo code is provided in the submitted demo.ipynb.
 It is advised to use the pre-processing and post-processing functions (provided in demo.ipynb) along with the model for best results.
 
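Since the checkpoint is a LoRA adaptor rather than a full model, inference requires attaching it to the Llama2 base weights. A minimal sketch with transformers and peft, assuming a generative yes/no readout; the base variant, the adapter repo id, and the decoding settings below are placeholders, and demo.ipynb remains the authoritative recipe:

```python
# Sketch: attach the LoRA adaptor to the Llama2 base model and classify a pair.
# "meta-llama/Llama-2-7b-hf" and "Cyrus1020/llama2-av-lora" are assumed ids.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"     # assumed base checkpoint (gated access)
adapter_id = "Cyrus1020/llama2-av-lora"  # hypothetical adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # the ~192 MB LoRA adaptor
model.eval()

prompt = (
    "Determine whether the following two texts were written by the same "
    "author. Answer 'yes' or 'no'.\n"
    "Text 1: I rather enjoyed the play.\n"
    "Text 2: The play was most enjoyable.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=3)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

The pre- and post-processing functions from demo.ipynb should still wrap calls like this one.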
 
@@ -78,7 +80,7 @@ This model is based upon a Llama2 model that was fine-tuned
 
 - trained on: V100 16GB
 - overall training time: 59 hours
-- duration per training epoch: 59 minutes
+- duration per training epoch: 59 hours
 - model size: ~27GB
 - LoRA adaptor size: 192 MB
 
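These figures only fit together if most of the base model is frozen and memory is aggressively saved; only the 192 MB LoRA adaptor is actually trained. A sketch of a typical recipe under that assumption (the real hyperparameters are not published in this README, and every value below is illustrative):

```python
# Illustrative setup: base weights quantized to 4-bit via bitsandbytes so a
# large Llama2 model fits a 16 GB V100; only small LoRA matrices are trained.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # V100 has no bfloat16 support
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder base id
    quantization_config=bnb,
    device_map="auto",
)
lora = LoraConfig(
    r=16,                                 # illustrative rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # adapter weights only, megabytes not gigabytes
```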
 
@@ -125,8 +127,10 @@ The development set provided, amounting to 6K pairs.
 ### Software
 
 
-- Transformers 4.18.0
-- Pytorch 1.11.0+cu113
+- Transformers
+- PyTorch
+- bitsandbytes
+- Accelerate
 
 ## Bias, Risks, and Limitations
 
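The commit drops the pinned versions from this list. A quick way to record the versions actually installed, assuming the pip package names transformers, torch, bitsandbytes, and accelerate:

```python
# Print installed versions of the stack listed above (stdlib only).
import importlib.metadata as md

for pkg in ("transformers", "torch", "bitsandbytes", "accelerate"):
    try:
        print(f"{pkg}: {md.version(pkg)}")
    except md.PackageNotFoundError:
        print(f"{pkg}: not installed")
```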
 
@@ -135,10 +139,3 @@ The development set provided, amounting to 6K pairs.
 Any inputs (concatenation of two sequences plus prompt words) longer than
 4096 subwords will be truncated by the model.
 
-## Additional Information
-
-<!-- Any other information that would be useful for other people to know. -->
-
-The hyperparameters were determined by experimentation
-with different values, such that the model could succesfully train on the V100 with a gradual decrease in training loss. Since LoRA is used, the Llama2 base model must also
-be loaded for the model to function, pre-trained Llama2 model access would need to be requested, access could be applied on https://huggingface.co/meta-llama.
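Given the 4096-subword limit above, inputs can be validated before inference. A small sketch, assuming the Llama2 tokenizer (placeholder id) and a hypothetical prompt-word budget:

```python
# Length guard for the 4096-subword context limit described above.
from transformers import AutoTokenizer

MAX_TOKENS = 4096
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder id

def fits_context(text1: str, text2: str, prompt_overhead: int = 64) -> bool:
    """True if both texts plus an assumed prompt-word budget fit the window."""
    n_tokens = (
        len(tokenizer(text1)["input_ids"])
        + len(tokenizer(text2)["input_ids"])
        + prompt_overhead
    )
    return n_tokens <= MAX_TOKENS

print(fits_context("First text...", "Second text..."))
```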
 