Update README.md
README.md CHANGED
@@ -13,6 +13,8 @@ repo: N.A.
 
 <!-- Provide a quick summary of what the model is/does. -->
 
+This model was trained as part of the coursework for COMP34812.
+
 This is a binary classification model that was trained with prompt input to
 detect whether two pieces of text were written by the same author.
 
@@ -23,8 +25,8 @@ This is a binary classification model that was trained with prompt input to
 
 <!-- Provide a longer summary of what this model is. -->
 
-This model is based
-on 30K pairs of texts for authorship verification. The model is
+This model is based on a Llama2 model that was fine-tuned
+on 30K pairs of texts for authorship verification. The model is fine-tuned with prompt inputs to utilize its linguistic knowledge.
 To run the model, demo code is provided in the submitted demo.ipynb.
 It is advised to use the pre-processing and post-processing functions (provided in demo.ipynb) along with the model for best results.
 
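The notebook itself is not part of this diff, so as a rough illustration of what prompt-based inference with a LoRA-tuned Llama2 classifier of this kind usually looks like: the checkpoint names, adapter path, prompt template, and Yes/No parsing below are all assumptions, not the card's actual code (that lives in demo.ipynb).

```python
# Hypothetical sketch only: checkpoint names, the prompt template, and the
# Yes/No parsing are assumptions; the card's real pre/post-processing is in
# demo.ipynb. Assumes the fine-tune answers the prompt with "Yes" or "No".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Llama-2-13b-hf"  # assumed: ~27GB in fp16 suggests a 13B base
ADAPTER = "path/to/lora-adapter"    # assumed location of the 192 MB adapter

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER)  # attach the LoRA weights
model.eval()

def same_author(text_a: str, text_b: str) -> bool:
    # Illustrative prompt; the actual template is defined in demo.ipynb.
    prompt = (
        "Do the following two texts have the same author? Answer Yes or No.\n"
        f"Text 1: {text_a}\nText 2: {text_b}\nAnswer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=3)
    answer = tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return answer.strip().lower().startswith("yes")
```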
@@ -78,7 +80,7 @@ This model is based upon a Llama2 model that was fine-tuned
 
 - trained on: V100 16GB
 - overall training time: 59 hours
-- duration per training epoch: 59
+- duration per training epoch: 59 hours
 - model size: ~27GB
 - LoRA adaptor size: 192 MB
 
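The list above reports only hardware, timings, and sizes; the card does not publish its LoRA hyperparameters (the note deleted further down says they were found by experimentation). A generic sketch of how such an adapter is attached for training, with placeholder values throughout:

```python
# Placeholder LoRA setup: r, alpha, dropout, and target modules are invented
# for illustration; the card does not publish its hyperparameters. Adapter
# size on disk (the card reports 192 MB) depends on these choices.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-hf")  # assumed base

lora_config = LoraConfig(
    r=16,                                 # placeholder rank
    lora_alpha=32,                        # placeholder scaling factor
    target_modules=["q_proj", "v_proj"],  # common choice for Llama-style blocks
    lora_dropout=0.05,                    # placeholder
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable
```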
@@ -125,8 +127,10 @@ The development set provided, amounting to 6K pairs.
 ### Software
 
 
-- Transformers
-- Pytorch
+- Transformers
+- PyTorch
+- bitsandbytes
+- Accelerate
 
 ## Bias, Risks, and Limitations
 
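Since the fp16 weights (~27 GB) cannot fit on the 16 GB V100 named earlier, bitsandbytes and Accelerate in this list are presumably what let the base model load at all. A sketch of quantized loading under that assumption; the card does not state the exact settings used:

```python
# Assumed 4-bit loading via bitsandbytes; the card lists the libraries but
# not its quantization settings, so these values are illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # V100 has no bfloat16 support
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",     # assumed base checkpoint
    quantization_config=bnb_config,
    device_map="auto",               # Accelerate places layers automatically
)
```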
@@ -135,10 +139,3 @@ The development set provided, amounting to 6K pairs.
 Any inputs (concatenation of two sequences plus prompt words) longer than
 4096 subwords will be truncated by the model.
 
-## Additional Information
-
-<!-- Any other information that would be useful for other people to know. -->
-
-The hyperparameters were determined by experimentation
-with different values, such that the model could succesfully train on the V100 with a gradual decrease in training loss. Since LoRA is used, the Llama2 base model must also
-be loaded for the model to function, pre-trained Llama2 model access would need to be requested, access could be applied on https://huggingface.co/meta-llama.
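Because that truncation is silent, a quick length check before inference can flag pairs that will lose content; the tokenizer and prompt shape here are again placeholders consistent with the sketches above:

```python
# Sketch of guarding against the 4096-subword limit; the tokenizer and the
# prompt template are assumptions, not the card's actual code.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-13b-hf")  # assumed base

def check_length(text_a: str, text_b: str, limit: int = 4096) -> None:
    prompt = (
        "Do the following two texts have the same author? Answer Yes or No.\n"
        f"Text 1: {text_a}\nText 2: {text_b}\nAnswer:"
    )
    n_tokens = len(tokenizer(prompt)["input_ids"])
    if n_tokens > limit:
        print(f"Warning: {n_tokens} subwords; everything past {limit} is truncated.")
```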