recommended max length?

#2
by brando - opened

ref: https://arxiv.org/pdf/2403.17297

Based on the huge context lengths reported there, using 4096 should be ok, right?

ChatGPT:

Based on the report, InternLM2 was initially trained with a context length of 4096 tokens and later extended to 32k tokens during training, with some scenarios designed to handle context windows up to 200k. A context length of 4096 tokens is therefore well within the capabilities of the InternLM-2.5-1.8B model on Hugging Face and should be fine for your purposes.
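For reference, here is a minimal sketch of how one might load the model with `transformers` and cap the prompt at 4096 tokens. The repo id `internlm/internlm2_5-1_8b` and the prompt are assumptions rather than anything stated in this thread, and `trust_remote_code=True` is included because the InternLM repos ship custom modeling code as far as I know.

```python
# Sketch: load InternLM-2.5-1.8B and cap the input at 4096 tokens.
# The repo id and prompt below are assumptions, not taken from the thread.
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "internlm/internlm2_5-1_8b"  # assumed Hugging Face repo id

# Check the configured position-embedding limit before committing to a length.
config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
print("max_position_embeddings:", config.max_position_embeddings)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Summarize the InternLM2 technical report in one paragraph."
# Truncate the prompt to 4096 tokens, matching the base training context length.
inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=4096)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the config reports a larger limit (e.g. 32k), then 4096 is comfortably inside it.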
