recommended max length?
#2 by brando - opened
ref: https://arxiv.org/pdf/2403.17297
Based on the huge context length reported, 4096 should be ok, right?
ChatGPT:
Based on the report, InternLM2 was initially trained with a 4096-token context length and later extended to 32k tokens during training; the paper also reports handling contexts of up to 200k tokens in some scenarios. A context length of 4096 tokens is therefore well within the capabilities of the InternLM-2.5-1.8B model on Hugging Face and should be fine for your purposes.
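For reference, a minimal sketch of capping inputs at 4096 tokens with the transformers library. The repo id internlm/internlm2_5-1_8b, the prompt, and the generation settings are assumptions; adjust them to the checkpoint you actually use.

```python
# Minimal sketch, assuming the Hugging Face repo id below; adjust to your checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "internlm/internlm2_5-1_8b"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Truncate the prompt to 4096 tokens, matching the base pre-training context length.
prompt = "Summarize the InternLM2 technical report in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=4096)

# Generate, keeping prompt plus new tokens well inside the 4096-token window.
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```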