Update README.md
README.md
---
license: mit
---
This model was trained on a new dataset built from the poems by Anne Bradstreet hosted by [Public Domain Poetry](https://www.public-domain-poetry.com/anne-bradstreet). Specifically, I downloaded all 40 poems and fine-tuned a bert-base-uncased text classification model on Amazon SageMaker. For the negative class, I generated GPT-2 samples of length 70: for each line of Bradstreet, I generated a generic GPT-2 response. I treated these responses as my negative class.
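
The labeling scheme described above can be sketched roughly as follows. This is only an illustration, not the script used for this model: `build_labeled_lines` and `generate` are hypothetical names, `generate` stands in for whatever GPT-2 call produced the responses, and dropping empty generations is an assumption.

```python
from typing import Callable, List, Tuple

# Label convention assumed from this card: Label0 means "not Bradstreet".
BRADSTREET, GENERATED = 1, 0


def build_labeled_lines(
    poem_lines: List[str],
    generate: Callable[[str], str],
) -> List[Tuple[str, int]]:
    """Pair each Bradstreet line with one generated response.

    `generate` is a hypothetical stand-in for a GPT-2 call that takes a
    line as the prompt and returns a generic response.
    """
    rows: List[Tuple[str, int]] = []
    for line in poem_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between stanzas
        rows.append((line, BRADSTREET))
        response = generate(line).strip()
        if response:  # assumption: empty generations are dropped
            rows.append((response, GENERATED))
    return rows
```

Passing the generator in as a function keeps the sketch independent of any particular GPT-2 library or sampling settings.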
The classifier dataset has 6,947 positive lines written by Anne Bradstreet and 5,219 lines generated by GPT-2 in response, for a total of 12,166 labeled lines. The negative class contains only the GPT-2 responses, and the positive class contains only the actual Bradstreet lines.
I split the data 80/20 into train and test sets, leaving 9,732 labeled samples in training and 2,435 samples in test.
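
As a quick sanity check on these counts, the arithmetic can be reproduced in a few lines. The rounding rule here (test size rounded up, training set gets the remainder) is an assumption, not necessarily how the split was actually made.

```python
import math

positive = 6947  # Bradstreet lines, as reported in this card
negative = 5219  # GPT-2 responses, as reported in this card
total = positive + negative

# Assumed rounding: the test size is rounded up, training gets the rest.
test_size = math.ceil(0.2 * total)
train_size = total - test_size

print(f"total={total} train={train_size} test={test_size}")
```

Under this assumption the total and the training count match the figures above, while the test count comes out one lower than the 2,435 reported, so the actual split procedure may have differed slightly.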
I trained the model on SageMaker using the Hugging Face deep learning container. I also used SageMaker Training Compiler, which allowed a batch size of 64 samples on an ml.p3.2xlarge. After 42 minutes of training, over only 5 epochs, I reached a train loss of 0.0714. Test loss is forthcoming.
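
For a rough sense of scale, the figures above imply the step counts below. This is back-of-the-envelope arithmetic only: it assumes no gradient accumulation, that the last partial batch of each epoch is kept, and that all 42 minutes were spent in training steps, none of which is confirmed here.

```python
import math

train_samples = 9732  # training set size reported in this card
batch_size = 64       # samples per batch reported in this card
epochs = 5
minutes = 42

# Assumption: the final, smaller batch of each epoch is not dropped.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

# Assumption: the full 42 minutes is training time (no startup or compile).
seconds_per_step = minutes * 60 / total_steps

print(f"{steps_per_epoch} steps/epoch, {total_steps} steps total")
print(f"about {seconds_per_step:.1f} s per step")
```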
In my own tests, the model always seems very confident: it routinely gives a confidence score of at least 99.8%. Each prediction request should contain a single line only, as this is how the model was fine-tuned. Multiple lines in a prediction request will always result in a Label0 response, i.e. not written by Anne Bradstreet, even if pulled directly from her works.
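
Because of the single-line constraint, a longer passage has to be split before it is sent for prediction. A minimal sketch of that step follows; `classify_lines` and `classify` are hypothetical names, and `classify` stands in for whatever call returns a label and score for one line of text.

```python
from typing import Callable, Dict, List


def classify_lines(
    text: str,
    classify: Callable[[str], Dict[str, object]],
) -> List[Dict[str, object]]:
    """Send each non-empty line of `text` to the classifier separately."""
    results: List[Dict[str, object]] = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # blank lines carry no text to classify
            results.append({"line": line, **classify(line)})
    return results
```

Each returned entry holds the original line together with whatever fields the classifier returned for it, so per-line labels can be inspected or aggregated afterwards.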
In short, the model seems to know the difference between generic GPT-2 text responding to a Bradstreet prompt and the output of a model fine-tuned on Bradstreet text and generating based on Bradstreet responses.
This was developed exclusively for use at an upcoming workshop.