πŸ™ GitHub β€’ πŸ‘Ύ Discord β€’ 🐀 Twitter β€’ πŸ’¬ WeChat
πŸ“ Paper β€’ πŸ’ͺ Tech Blog β€’ πŸ™Œ FAQ β€’ πŸ“— Learning Hub

Quantization Description

This repo contains GGUF quantized versions of the Yi-1.5-9B-Chat model. The model is supplied in several quantizations so that you can choose the one that works best on the hardware you want to run it on.

The following quantization types are provided:

  • Q4_0
  • Q4_1
  • Q4_K
  • Q4_K_S
  • Q4_K_M
  • Q5_0
  • Q5_1
  • Q5_K
  • Q5_K_M
  • Q5_K_S
  • Q6_K
  • Q8_0
  • Q2_K
  • Q3_K
  • Q3_K_S
  • Q3_K_XS
  • IQ2_K
  • IQ3_S
  • IQ3_XXS
  • IQ4_NL
  • IQ4_XS
  • IQ5_K
  • IQ2_S
  • IQ2_XS
  • IQ1_S
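When choosing a quantization for a given machine, a rough size estimate can help narrow the list. The sketch below is a back-of-the-envelope calculation, not a measurement of the files in this repo. It covers only the simple block formats, whose bits-per-weight figures follow directly from their block layout (32 weights per block plus fp16 scale and offset fields). The K and IQ types mix tensor formats and are left out. The parameter count (8.83B) is the figure shown in this page's metadata, and `overhead_gib` is a hypothetical allowance for the KV cache and runtime buffers that you would need to tune for your setup.

```python
# Bits per weight for the simple GGUF block formats (32 weights per block).
BITS_PER_WEIGHT = {
    "Q4_0": 4.5,  # 16 bytes of 4-bit values + 2-byte fp16 scale
    "Q4_1": 5.0,  # as Q4_0, plus a 2-byte fp16 offset
    "Q5_0": 5.5,  # 16 bytes + 4 bytes of high bits + 2-byte scale
    "Q5_1": 6.0,  # as Q5_0, plus a 2-byte fp16 offset
    "Q8_0": 8.5,  # 32 bytes of 8-bit values + 2-byte scale
}

GIB = 2**30


def estimate_size_bytes(n_params: int, quant: str) -> float:
    """Approximate weight size, assuming every tensor uses `quant`."""
    return n_params * BITS_PER_WEIGHT[quant] / 8


def fits_in_memory(n_params: int, quant: str, budget_gib: float,
                   overhead_gib: float = 1.0) -> bool:
    """Check the estimate plus an assumed runtime overhead against a budget."""
    needed_gib = estimate_size_bytes(n_params, quant) / GIB + overhead_gib
    return needed_gib <= budget_gib


N_PARAMS = 8_830_000_000  # "8.83B params" from the model page metadata

for quant in BITS_PER_WEIGHT:
    size_gib = estimate_size_bytes(N_PARAMS, quant) / GIB
    print(f"{quant}: ~{size_gib:.2f} GiB of weights")
```

Real files differ somewhat from these figures because they also carry metadata and may keep some tensors at higher precision, so treat the result as a first filter and check the actual file sizes in the repo.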

Intro

Yi-1.5 is an upgraded version of Yi. It continues pre-training from Yi on a high-quality corpus of 500B tokens and is then fine-tuned on 3M diverse samples.

Compared with Yi, Yi-1.5 delivers stronger performance in coding, math, reasoning, and instruction following, while maintaining excellent language understanding, commonsense reasoning, and reading comprehension.

Model     Context Length    Pre-trained Tokens
Yi-1.5    4K, 16K, 32K      3.6T

Models

Benchmarks

  • Chat models

    Yi-1.5-34B-Chat matches or outperforms larger models on most benchmarks.

    Yi-1.5-9B-Chat is the top performer among similarly sized open-source models.

  • Base models

    Yi-1.5-34B matches or outperforms larger models on some benchmarks.

    Yi-1.5-9B is the top performer among similarly sized open-source models.

Quick Start

To get up and running quickly with Yi-1.5 models, see the README.
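For the GGUF files in this repo specifically, fetching a single quantization might look like the sketch below. It assumes the `huggingface_hub` package is installed and that each file name ends in the quantization type followed by `.gguf`. That naming pattern is a guess, not something this page confirms, so list the repo files first if the match fails. `pick_gguf_file` and `download_quant` are hypothetical helpers written for this example.

```python
import re

REPO_ID = "thesven/Yi-1.5-9B-Chat-GGUF"


def pick_gguf_file(filenames, quant):
    """Return the file whose name ends in '<quant>.gguf' (case-insensitive).

    The lookbehind stops 'Q2_K' from matching 'IQ2_K', and anchoring at the
    end stops 'Q4_K' from matching 'Q4_K_M'.
    """
    pattern = re.compile(
        rf"(?<![A-Za-z0-9]){re.escape(quant)}\.gguf$", re.IGNORECASE
    )
    matches = sorted(name for name in filenames if pattern.search(name))
    if not matches:
        raise ValueError(f"no GGUF file found for quantization {quant!r}")
    return matches[0]


def download_quant(quant, repo_id=REPO_ID):
    """Download one quantization and return its local path."""
    # Imported here so the file-name matching can be used without the package.
    from huggingface_hub import hf_hub_download, list_repo_files

    filename = pick_gguf_file(list_repo_files(repo_id), quant)
    return hf_hub_download(repo_id=repo_id, filename=filename)
```

Calling `download_quant("Q4_K_M")` would return a local path that can be passed to any GGUF-compatible runtime.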

Format: GGUF
Model size: 8.83B params
Architecture: llama
Available bit widths: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit
