[REQUEST] Increase the available context length of Meta-Llama-3.1-70b-Instruct

#528
by SimaDude - opened

I have already raised this here (https://huggingface.co/spaces/huggingchat/chat-ui/discussions/372#66a069b0aeccf0fd9e012175), but it seems my message wasn't noticed by the team.

Currently the context length of Meta-Llama-3.1-70b-Instruct is cut off at 7k tokens, whereas the model's maximum is 128k. I understand that HuggingChat is a free service, and I do believe that setting a limit on the context length of such models is a good idea. But how come Meta-Llama-3.1-405B-Instruct-FP8 then has a 14k context length? It is literally a model with 5.8x more parameters, yet it was given a longer context than Meta-Llama-3.1-70b-Instruct. Another example is c4ai-command-r-plus, which also has more parameters than Meta-Llama-3.1-70b-Instruct, but it was still given a longer context (28k tokens).
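For reference, the 128k figure is not a serving-side setting but what the model itself advertises. A minimal sketch, assuming `transformers` is installed and you have access to the gated `meta-llama` repo, to confirm the window from the model config:

```python
# Check the context window declared by the model config, independent of
# whatever cutoff HuggingChat applies on top of it.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3.1-70B-Instruct")
print(config.max_position_embeddings)  # 131072, i.e. the 128k context window
```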

My current system prompt, which I use for RP purposes, is literally 7k tokens, so I can't even feed it into Meta-Llama-3.1-70b-Instruct. I was stuck using c4ai-command-r-plus for months, because it was the only model good enough to handle my prompt correctly.
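In case it helps reproduce the issue, here is a rough sketch of how one can count the tokens a system prompt occupies with the model's own tokenizer (the `system_prompt.txt` file is a hypothetical stand-in for my RP prompt, and access to the gated repo is assumed):

```python
# Count how many tokens a system prompt takes up after chat templating,
# assuming transformers is installed and the gated meta-llama repo is accessible.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-70B-Instruct")

with open("system_prompt.txt") as f:  # hypothetical file holding the RP prompt
    system_prompt = f.read()

messages = [{"role": "system", "content": system_prompt}]
token_ids = tokenizer.apply_chat_template(messages, tokenize=True)
print(len(token_ids))  # ~7000 tokens, already at the current HuggingChat cutoff
```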

Please, let me know if there are any plans to change this.

This is even the case for paid API users.
Please, HF, do step it up! 8k context is not usable for most use cases.
