ITS NOT REAL

#15
by rombodawg - opened

[Image: 40qhlp.png]

Gradient AI org

Hey @vihangsharma, as mentioned in the other threads, we have worked on better alignment.

@rombodawg We liked your meme!

https://huggingface.co/gradientai/Llama-3-70B-Instruct-Gradient-262k Let us know if you are interested in doing the same for 8B.

I would love to see the 8B be as effective at extremely long-context inference as the 70B. Most of the open-source community runs modest hardware, at most an RTX 3090 with 24 GB of VRAM, and even that's rare. An update to the 8B-Instruct model would be astounding.

Much appreciation to the Gradient team, and thanks for this amazing model. I haven't tried it beyond 100k tokens, but I'm actively using inputs of roughly 26k to 100k tokens, including a very long system prompt, exactly the use case Mark described. Miqu handles those scenarios well, but its 32k context is very limiting, so I had to go back and forth with GPT-4o. Now Gradient 70B 262k fills the gap, and I have replaced Miqu with it. I'm happily using Gradient's 262k to process my roughly 100k-token system prompt. Gradient's legacy brings new possibilities.
