LLaMA 33B fine-tuned on wikitext_document_level with combined linear and NTK-aware RoPE scaling (alpha=4, scale=2). The model is coherent up to a context length of at least 8k and might work beyond that. This is a merged version of llama33b-s2a4-qlora.
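
As a rough illustration of how linear and NTK-aware scaling can be combined, here is a minimal sketch that computes rotary angles for a single token position. It is not taken from this model's training code. The function name `scaled_rope_angles`, the head dimension of 128, the rotary base of 10000, and the specific NTK formula (base multiplied by `alpha ** (d / (d - 2))`) are assumptions based on commonly used formulations; only `alpha=4` and `scale=2` come from the description above.

```python
import math


def scaled_rope_angles(position, head_dim=128, base=10000.0, alpha=4.0, scale=2.0):
    """Return one rotation angle per dimension pair for a token position.

    Hypothetical helper: the formulas are common formulations, assumed here,
    not confirmed as the ones used to train this model.
    """
    # NTK-aware step (assumed formula): enlarge the rotary base so that
    # low-frequency dimensions are stretched more than high-frequency ones.
    ntk_base = base * alpha ** (head_dim / (head_dim - 2))
    # Linear step: compress positions by a constant factor.
    scaled_position = position / scale
    # Standard RoPE: dimension pair i rotates at frequency ntk_base^(-2i/d).
    return [
        scaled_position * ntk_base ** (-2 * i / head_dim)
        for i in range(head_dim // 2)
    ]


# Compare scaled and unscaled angles at position 8192.
angles = scaled_rope_angles(position=8192)
plain = scaled_rope_angles(position=8192, alpha=1.0, scale=1.0)
print(plain[0] / angles[0])    # ratio for the highest-frequency pair
print(plain[-1] / angles[-1])  # ratio for the lowest-frequency pair
```

Under these assumptions, the highest-frequency pair is slowed only by the linear factor of 2, while the lowest-frequency pair is slowed by alpha × scale = 8, which is the intuition behind combining the two methods.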

Note that this is not an instruct model; it is base LLaMA with an extended sequence length.

Dataset used to train chargoddard/llama33b-s2a4: wikitext_document_level