---
datasets:
- c-s-ale/alpaca-gpt4-data
- Open-Orca/OpenOrca
- Intel/orca_dpo_pairs
- allenai/ultrafeedback_binarized_cleaned
- HuggingFaceH4/no_robots
license: cc-by-nc-4.0
language:
- en
library_name: ExLlamaV2
pipeline_tag: text-generation
tags:
- Mistral
- SOLAR
- Quantized Model
- exl2
base_model:
- rishiraj/meow
---

# exl2 quants for meow

This repository contains exl2 quantizations of the [meow](https://huggingface.co/rishiraj/meow) model by [Rishiraj Acharya](https://huggingface.co/rishiraj). meow is a fine-tune of [SOLAR-10.7B-Instruct-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0) on the [no_robots](https://huggingface.co/datasets/HuggingFaceH4/no_robots) dataset.

## Current models

| exl2 BPW | Model Branch | Model Size | Minimum VRAM (4096 Context) |
|-|-|-|-|
| 2-bit | main | 3.28 GB | 6 GB GPU |
| 4-bit | 4bit | 5.61 GB | 8 GB GPU |
| 5-bit | 5bit | 6.92 GB | 10 GB GPU, 8 GB with swap |
| 6-bit | 6bit | 8.23 GB | 10 GB GPU |
| 8-bit | 8bit | 10.84 GB | 12 GB GPU |

### Note

On a 12 GB Nvidia GeForce RTX 3060, I averaged around 20 tokens per second with the 8-bit quant at the full 4096 context.

## Where to use

You can run an exl2 model in several places, including:

- [oobabooga's Text Gen Webui](https://github.com/oobabooga/text-generation-webui)
  - When using the built-in downloader, format the model name like this: Anthonyg5005/rishiraj-meow-10.7B-exl2**\:QuantBranch**
- [tabbyAPI](https://github.com/theroyallab/tabbyAPI)
- [ExUI](https://github.com/turboderp/exui)
- [KoboldAI](https://github.com/henk717/KoboldAI) (clone the repo; don't use a snapshot)

# WARNING

This model cannot be used commercially due to the Alpaca dataset license. Use it only for research or personal purposes.
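As a rough sanity check on the sizes in the table, an exl2 quant's on-disk size scales approximately linearly with its average bits per weight (BPW). Below is a minimal sketch of that estimate; the 10.7B parameter count comes from the SOLAR-10.7B base model, while the fixed overhead term (for embeddings and non-quantized tensors) is an illustrative assumption, not an exl2 specification:

```python
def estimate_quant_size_gb(n_params_billions: float, bpw: float,
                           overhead_gb: float = 0.3) -> float:
    """Rough on-disk size of a quantized model, in GB.

    n_params_billions: parameter count in billions (10.7 for SOLAR-10.7B).
    bpw: average bits per weight of the quant.
    overhead_gb: assumed overhead for embeddings/metadata (illustrative guess).
    """
    # bits -> bytes: divide by 8; billions of params ~ GB scale.
    return n_params_billions * bpw / 8 + overhead_gb

# Compare estimates against the sizes reported in this repo's table (GB):
for bpw, reported in [(4, 5.61), (5, 6.92), (6, 8.23), (8, 10.84)]:
    est = estimate_quant_size_gb(10.7, bpw)
    print(f"{bpw}-bit: estimated {est:.2f} GB vs reported {reported} GB")
```

The low-BPW quants deviate more from this linear estimate because exl2 keeps some layers at higher precision, so treat the formula only as a ballpark when deciding which branch fits your GPU.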