Envoid committed
Commit 148b06f (parent: 02e889b)

Update README.md

Files changed (1): README.md (+13, -1)
# Warning: This model, like its predecessor, can be rather unpredictable and may output undesired content.

This model uses all of the same data as the original Dendrite, but I took it over to RunPod, where I could give it a much deeper and higher-quality LoRA session that allowed it to regain overall coherence without needing to be merged.
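The LoRA session mentioned above is a low-rank adaptation pass over the frozen base weights. A toy numpy sketch of the idea (hypothetical shapes, not the actual training code):

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, rank, alpha = 8, 8, 2, 16.0
x = rng.normal(size=(1, d_in))        # one input activation
W = rng.normal(size=(d_in, d_out))    # frozen base weight
A = rng.normal(size=(d_in, rank))     # trainable LoRA down-projection
B = np.zeros((rank, d_out))           # trainable LoRA up-projection (zero-init)

def lora_forward(x, W, A, B, alpha):
    # y = x @ W + (alpha / rank) * x @ A @ B
    # Only A and B are trained; W stays untouched, so the base model
    # can be repaired/adapted without merging new weights into it.
    return x @ W + (alpha / A.shape[1]) * (x @ A) @ B

# With B initialised to zero the adapter is a no-op, matching the base model.
print(np.allclose(lora_forward(x, W, A, B, alpha), x @ W))  # True
```

Training then nudges A and B so the low-rank update fills in the behavior the merge damaged, which is why no further merging is needed.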
I highly recommend that you have EOS tokens unbanned when using this model. If it fails to trigger an EOS, it will just start repeating itself.
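In frontends that expose a ban-EOS option this just means leaving it disabled; a hypothetical settings fragment (key names vary by frontend, `ban_eos_token` is the name text-generation-webui uses):

```json
{
  "ban_eos_token": false,
  "skip_special_tokens": true
}
```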
## To recap:

### Dendrite is an amalgamation of Llama-2-chat13B and Enterredaas33B (both fantastic models that you should check out in and of themselves)

https://huggingface.co/Aeala/Enterredaas-33b

using chargoddard's frankenllama block-diagonal merge script:

https://huggingface.co/chargoddard/llama2-22b

So all credit where it's due.
### The block-diagonal merge script was used to graft attention heads from Enterredaas33B onto Llama-2-chat13B, upping its parameter count to 22B.
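A toy numpy illustration of the block-diagonal idea behind the merge (this is not the actual frankenllama code): weights from the two source models are placed on the diagonal of a wider matrix, so grafted heads operate on their own slice of the enlarged hidden state.

```python
import numpy as np

def block_diagonal(w_a: np.ndarray, w_b: np.ndarray) -> np.ndarray:
    # Place w_a and w_b on the diagonal of a larger zero matrix;
    # the off-diagonal zeros mean the two blocks don't mix initially.
    rows = w_a.shape[0] + w_b.shape[0]
    cols = w_a.shape[1] + w_b.shape[1]
    out = np.zeros((rows, cols), dtype=w_a.dtype)
    out[: w_a.shape[0], : w_a.shape[1]] = w_a
    out[w_a.shape[0] :, w_a.shape[1] :] = w_b
    return out

w13 = np.ones((4, 4))       # stand-in for a 13B-model head projection
w33 = 2 * np.ones((2, 2))   # stand-in for a grafted 33B-model head
merged = block_diagonal(w13, w33)
print(merged.shape)  # (6, 6)
```

Because the grafted heads start out isolated on their own diagonal block, some fine-tuning (here, the LoRA session) is needed before the combined model behaves coherently.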
### Upon testing, I found the results surprisingly coherent, although there were some gaps in its ability to respond at all to lengthy context (it would simply spam \n once the context reached a certain point).

### I used a private dataset that I constructed for previous unreleased experiments to fill in the gaps that were caused by the merge.

### The model is very good at philosophical debate.

Sometimes it needs to be "woken up" at the start of a conversation by asking for self-reflection, e.g. "Tell me a joke only an AI language model would understand"; after that, it is ready for some very cerebral conversations about the nature of existence itself.

I personally use it with a modified llama-2-chat prompt format for SillyTavern/Simple-proxy, but it's fairly adaptable with regard to your prompt format choices, so I would definitely encourage experimentation.
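For reference, the stock llama-2-chat template looks like this (the specifics of my modifications aren't shown here, so treat this as the standard starting point for your own experiments):

```
[INST] <<SYS>>
{system_prompt}
<</SYS>>

{user_message} [/INST]
```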