The model was pre-trained continuously on a single A10G GPU on an AWS instance for 133 hours, with each epoch taking 45 hours, using bf16 (bfloat16) precision.
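bf16 keeps float32's 8-bit exponent but truncates the mantissa to 7 bits, trading precision for float32's dynamic range, which is why it is popular for mixed-precision pre-training on Ampere GPUs like the A10G. A minimal pure-Python sketch of what bf16 storage means (illustrative only, not part of this repo's training code; real conversions round-to-nearest-even rather than truncate):

```python
import struct

def to_bf16_bits(x: float) -> int:
    # bf16 is the top 16 bits of an IEEE-754 float32:
    # 1 sign bit, 8 exponent bits, 7 mantissa bits.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16  # simplified: truncation instead of round-to-nearest-even

def from_bf16_bits(b: int) -> float:
    # Widen back to float32 by zero-filling the dropped mantissa bits.
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

print(from_bf16_bits(to_bf16_bits(1.0)))      # 1.0 survives exactly
print(from_bf16_bits(to_bf16_bits(3.14159)))  # 3.140625 — only ~2-3 decimal digits kept
```

In the Hugging Face `Trainer`, this precision is enabled with `TrainingArguments(bf16=True)` on hardware that supports it.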
#### Possible Future Directions:
1. Use a decoder-only model for pre-training and summarization.
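The decoder-only direction mostly changes how a (document, summary) pair is fed to the model: both are concatenated into one causal-LM sequence, and the loss is masked over the document tokens so only the summary is learned. A minimal sketch with toy token IDs (the `sep_id`/`eos_id` tokens and the `-100` ignore index follow common PyTorch convention and are assumptions, not code from this repo):

```python
IGNORE_INDEX = -100  # positions with this label are skipped by PyTorch's CrossEntropyLoss

def build_causal_lm_example(doc_ids, sep_id, summary_ids, eos_id):
    """Concatenate document + separator + summary into one causal-LM sequence,
    masking the document and separator positions in the labels."""
    input_ids = doc_ids + [sep_id] + summary_ids + [eos_id]
    labels = [IGNORE_INDEX] * (len(doc_ids) + 1) + summary_ids + [eos_id]
    return input_ids, labels

input_ids, labels = build_causal_lm_example([11, 12, 13], sep_id=1,
                                            summary_ids=[21, 22], eos_id=2)
print(input_ids)  # [11, 12, 13, 1, 21, 22, 2]
print(labels)     # [-100, -100, -100, -100, 21, 22, 2]
```

At inference time the same model would be prompted with `document + sep` and asked to continue, generating the summary token by token.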
#### Authors:
<a href="https://www.linkedin.com/in/bijaya-bhatta-69536018a/">Vijaya Bhatta</a>