What is the expected effective context length for this model?

#1
by Samvanity - opened

I know the base Mistral Small has a context length of 32768, but your training used a sequence length of 8192. Does that mean the effective context length for this model becomes 8192?

Thanks. I'm new to this.

Arli AI org

No, the context length capability is the same as the base model's.
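
To illustrate the distinction, here is a rough sketch, not the actual training code. It assumes that the training sequence length only caps how many tokens each training sample contributes (simple truncation here; packing or chunking are other possibilities), while the model's usable context window stays at the base model's value. The function names are hypothetical; the two numbers come from the question above.

```python
# Values taken from the thread; everything else is an illustrative assumption.
BASE_CONTEXT_LENGTH = 32768  # context length of the base Mistral Small
TRAIN_SEQ_LEN = 8192         # sequence length used during fine-tuning


def truncate_for_training(token_ids, seq_len=TRAIN_SEQ_LEN):
    """Cap one training sample at the training sequence length.

    Assumes plain truncation; a real pipeline might pack or chunk instead.
    """
    return token_ids[:seq_len]


def fits_at_inference(prompt_len, context_length=BASE_CONTEXT_LENGTH):
    """Check whether a prompt fits the context window at inference time.

    The limit here is the base model's context length, not the training
    sequence length.
    """
    return prompt_len <= context_length


# A 20,000-token sample is shortened for training...
sample = list(range(20000))
print(len(truncate_for_training(sample)))

# ...but a 20,000-token prompt is still within the context window.
print(fits_at_inference(20000))
print(fits_at_inference(40000))
```

Whether the model uses those longer contexts well is a separate question from whether it accepts them.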

Thank you. So how does the training data sequence length affect the model's performance?

Arli AI org

Usually, I see models follow instructions better at longer contexts when the training sequence length is longer.
