Max context

#1 by Altotas - opened

While it definitely has a distinct feel to its writing and is capable of composing beautiful descriptions, during my testing it wasn't able to handle even 16k context. I used q5_k_m.

[screenshot: 8676786.png]

This is not unusual.
As soon as a model is fine-tuned (this model contains TWO fine-tunes), the fine-tuning can impair the model's max-context "comprehension" on the input side.

Based on my experience, and on reading about a lot of other users' experiences, the ways to combat some of these issues are (a rough example of applying both follows the list):

1 - RoPE scaling - but with rope scaling you must modify your prompt to be more specific.
2 - Flash Attention.
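
For anyone who wants something concrete to try, here is a minimal sketch using llama-cpp-python. The file name, scale factor, and layer count are placeholders, not settings shipped with this repo - treat them as starting points for your own hardware and context target:

```python
# Rough sketch with llama-cpp-python (pip install llama-cpp-python).
# Paths and numbers below are placeholders - adjust for your own setup.
from llama_cpp import Llama

llm = Llama(
    model_path="model-q5_k_m.gguf",  # placeholder filename for the Q5_K_M quant
    n_ctx=16384,                     # context window you want to run at
    rope_freq_scale=0.5,             # linear RoPE scaling: 0.5 ~ doubles the trained context
    flash_attn=True,                 # enable Flash Attention if your build supports it
    n_gpu_layers=-1,                 # offload as many layers as fit to the GPU
)
```

The same knobs exist as CLI flags in llama.cpp itself and in most front-ends (KoboldCpp, text-generation-webui), just under slightly different names.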

On the output side:

Make sure your prompt has plenty of "meat on the bone" - short prompts can produce long generations, but the model can reach a point where it has no idea what to do as the output hits 2k/3k/4k+ tokens. Sometimes just a little more info in the prompt fixes this.

Likewise, repetition penalty and temperature make HUGE differences - too low a temperature and the model can run out of steam.
Too high a rep pen can stifle creativity (and too low a rep pen makes the output way too wordy, and with small words at that).

Top_K: Raise this for more word choices; this may also help with coherence in long generations.

Other parameters, like the new XTC sampler, will drastically alter output, especially at length.
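
Roughly, those knobs map onto the sampler arguments like this - the values are illustrative starting points, not tuned recommendations, and XTC is omitted because whether it is exposed depends on your backend and version:

```python
# Illustrative sampler settings - starting points only, not tuned values.
output = llm(
    "Write a detailed opening scene set in ...",  # give it plenty of "meat on the bone"
    max_tokens=2048,
    temperature=0.9,       # too low and the model runs out of steam on long generations
    repeat_penalty=1.05,   # too high stifles creativity; too low gets very wordy
    top_k=100,             # raise for more word choices at length
)
print(output["choices"][0]["text"])
```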
