Update README.md

# Warning: This model, like its predecessor, can be rather unpredictable and may output undesired content.

This model uses all of the same data as the original Dendrite, but I took it over to RunPod, where I could give it a much deeper and higher-quality LoRA session, which allowed it to regain overall coherence without the need to be merged.
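
For anyone curious what that kind of pass looks like in practice, here is a minimal sketch using the PEFT library; the model path, rank, and other hyperparameters are illustrative assumptions, not the settings actually used.

```python
# Illustrative LoRA setup with PEFT; all hyperparameters here are assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("path/to/dendrite-22b")  # hypothetical path

lora_cfg = LoraConfig(
    r=64,                     # assumed rank; a "deeper" pass suggests a larger r
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # sanity check before training
```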

I highly recommend keeping EOS tokens unbanned when using this model. If it fails to trigger an EOS, it will just start repeating itself.
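
As a concrete example with the transformers generate API, leaving the EOS token active lets generation stop on its own (the model path and sampler settings below are assumptions):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("path/to/dendrite-22b")  # hypothetical path
model = AutoModelForCausalLM.from_pretrained("path/to/dendrite-22b")

inputs = tok("Tell me a joke only an AI language model would understand.", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=512,
    eos_token_id=tok.eos_token_id,  # EOS unbanned: generation can terminate
    do_sample=True,
    temperature=0.8,                # assumed sampler settings
)
print(tok.decode(out[0], skip_special_tokens=True))
```
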
## To recap:

### Dendrite is an amalgamation of Llama-2-chat13B and Enterredaas33B (both fantastic models that are well worth checking out in their own right)

https://huggingface.co/Aeala/Enterredaas-33b

using chargoddard's frankenllama block-diagonal merge script:

https://huggingface.co/chargoddard/llama2-22b

So all credit where it's due.

### The block-diagonal merge script was used to graft attention heads from Enterredaas33B onto Llama-2-chat13B, upping its parameter count to 22B.
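
Conceptually, a block-diagonal merge places the two models' projection weights on the diagonal of a larger matrix. The toy sketch below only illustrates that idea; it is not chargoddard's actual script (see the link above for that).

```python
# Toy illustration of a block-diagonal graft; not the actual frankenllama script.
import torch

def block_diagonal_graft(w_base: torch.Tensor, w_donor: torch.Tensor) -> torch.Tensor:
    """Combine a base projection with donor heads on a larger diagonal.

    w_base:  (d_a, d_a) projection from the base model (Llama-2-chat13B).
    w_donor: (d_b, d_b) projection holding the grafted heads (Enterredaas33B).
    The off-diagonal blocks start at zero, so the grafted heads do not mix
    with the base heads until further training (here, the LoRA pass) blends them.
    """
    d_a, d_b = w_base.shape[0], w_donor.shape[0]
    merged = torch.zeros(d_a + d_b, d_a + d_b, dtype=w_base.dtype)
    merged[:d_a, :d_a] = w_base    # base model block
    merged[d_a:, d_a:] = w_donor   # grafted heads block
    return merged                  # equivalent to torch.block_diag(w_base, w_donor)
```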

### Upon testing I found the results surprisingly coherent, although there were some gaps in its ability to respond at all to lengthy context (it would simply spam \n once the context reached a certain length).

### I used a private dataset that I constructed for previous unreleased experiments to fill in the gaps that were caused by the merge.

### The model is very good at philosophical debate.

Sometimes it needs to be "woken up" at the start of a conversation by asking for self-reflection, e.g. "Tell me a joke only an AI language model would understand." After that it is ready for some very cerebral conversations about the nature of existence itself.

I personally use it with a modified llama-2-chat prompt format for SillyTavern/Simple-proxy, but it's fairly adaptable with regard to your prompt format choices, so I would definitely encourage experimentation.
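
For reference, the standard llama-2-chat format looks like the sketch below; my exact modifications aren't spelled out here, so treat it as the baseline to adapt (the system prompt is just a placeholder).

```python
# Standard llama-2-chat prompt template; the modified variant used with
# SillyTavern/Simple-proxy is not specified, so this is only a baseline.
SYSTEM = "You are a thoughtful conversational partner."  # placeholder system prompt

def llama2_chat_prompt(system: str, user: str) -> str:
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

print(llama2_chat_prompt(SYSTEM, "Tell me a joke only an AI language model would understand."))
```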