
This is Llama2-22b by chargoddard in a couple of GGML formats. I have no idea what I'm doing, so if something doesn't work as it should, or not at all, that's likely on me rather than the models themselves.
A second model merge has been released; the GGML conversions for it can be found here.

While I haven't had any issues so far, do note that the original repo states "Not intended for use as-is - this model is meant to serve as a base for further tuning".

Approximate VRAM requirements at 4K context:

| Model | Size | VRAM |
|--------|--------|--------|
| q5_1 | 16.4GB | 21.5GB |
| q4_K_M | 13.2GB | 18.3GB |
| q3_K_M | 10.6GB | 16.1GB |
| q2_K | 9.2GB | 14.5GB |
