---
license: cc-by-nc-2.0
---
This is a merge of LongAlpaca-70B-lora into lizpreciatior's lzlv_70b_fp16_hf, with the extra row and pad token removed so that the vocabularies match.
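A minimal sketch of such a merge, assuming the PEFT/transformers APIs and the usual LongAlpaca vocabulary sizes (32001 with the extra pad token, 32000 without); the repo IDs and exact steps below are assumptions, not a record of the actual procedure:

```python
# Sketch: fold the LongAlpaca LoRA into the base model, then trim the
# extra pad-token row so the vocabulary matches the base tokenizer.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "lizpreciatior/lzlv_70b_fp16_hf",
    torch_dtype=torch.float16,
    device_map="auto",
)

# The LoRA checkpoint expects the extra pad token, so grow the base
# embeddings first, then apply the adapter and fold its weights in.
base.resize_token_embeddings(32001)
model = PeftModel.from_pretrained(base, "Yukang/LongAlpaca-70B-lora")
model = model.merge_and_unload()

# Drop the extra row again so the vocabularies match.
model.resize_token_embeddings(32000)
model.save_pretrained("lzlv-longalpaca-70b-32k", safe_serialization=True)
```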
There is no additional fine-tuning. The resulting model does not appear to be broken; feel free to test whether it truly behaves like the original model with added 32K context capability (use linear RoPE scaling with a factor of 8).
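With transformers, that scaling can be requested at load time via the `rope_scaling` option; the local path below is a placeholder:

```python
# Load the merged model with linear RoPE scaling at factor 8, stretching
# the Llama-2 4K base context to 32K.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "lzlv-longalpaca-70b-32k",                       # placeholder path
    rope_scaling={"type": "linear", "factor": 8.0},
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("lzlv-longalpaca-70b-32k")
```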
ChuckMcSneed ran a benchmark here, indicating about 30% degradation at 8x the context length.
You could also try merging this with other models descended from LongLoRA (such as Aurelian).
A 6-bit EXL2 quantization is available here, and a 4-bit EXL2 here.
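Running one of the EXL2 quants with exllamav2's Python API might look like the sketch below; the model directory is a placeholder, and since the quant's bundled config may already record the scaling factor, check it before overriding:

```python
# Rough sketch of generation from the EXL2 quant; class and attribute
# names follow exllamav2's example scripts and are an assumption.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "path/to/exl2-quant"  # local download of the 6-bit or 4-bit quant
config.prepare()
config.scale_pos_emb = 8.0               # linear RoPE scaling factor 8, as above
config.max_seq_len = 32768               # full 32K context

model = ExLlamaV2(config)
model.load()
tokenizer = ExLlamaV2Tokenizer(config)
cache = ExLlamaV2Cache(model)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
print(generator.generate_simple("Once upon a time,", settings, 64))
```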
See this discussion for how to create merges like these.