Context Length Config
#12
by nicklikets - opened
The README.md states that the model has a context length of 128k, yet config.json says `"max_position_embeddings": 8192`. How come the maximum position embeddings aren't configured to ~128k?
This implementation is based on the Llama implementation, which materializes a huge causal-mask buffer; that would not be feasible for 128k context. The model does support 128k context with a better implementation.

```python
causal_mask = torch.full(
    (config.max_position_embeddings, config.max_position_embeddings),
    fill_value=True,
    dtype=torch.bool,
)
```
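To see why materializing that buffer is infeasible at 128k, here is a quick back-of-the-envelope check (the 8192 and 131072 values come from this discussion; `torch.bool` tensors use one byte per element):

```python
def mask_bytes(n: int) -> int:
    # An (n, n) torch.bool buffer takes n * n bytes (1 byte per element).
    return n * n

for n in (8192, 131072):
    print(f"n={n}: {mask_bytes(n) / 2**30:.2f} GiB")
```

At 8192 the mask is a negligible 0.06 GiB, but at 131072 it balloons to 16 GiB, more than most single GPUs can spare for a mask alone.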
Keeping the default context length low enough that end users don't hit OOM right out of the box is generally accepted as an unwritten rule in the HF community.
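Since config.json is just a default, one way to opt in to the longer context is to raise the value in a local copy of the file before loading the model. This is a minimal sketch of that edit; the temp-file setup only stands in for a cloned model repo, and whether the resulting memory use is tolerable depends on your hardware and attention implementation:

```python
import json
import os
import tempfile

# Stand-in for a locally cloned model repo's config.json
# (in practice this would be the path to your downloaded model).
tmpdir = tempfile.mkdtemp()
config_path = os.path.join(tmpdir, "config.json")
with open(config_path, "w") as f:
    json.dump({"max_position_embeddings": 8192}, f)

# Bump the context length to ~128k before loading the model.
with open(config_path) as f:
    cfg = json.load(f)
cfg["max_position_embeddings"] = 131072
with open(config_path, "w") as f:
    json.dump(cfg, f)

with open(config_path) as f:
    print(json.load(f)["max_position_embeddings"])  # 131072
```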
nicklikets changed discussion status to closed