---
base_model: mattshumer/Reflection-Llama-3.1-70B
library_name: transformers
license: llama3.1
pipeline_tag: text-generation
quantized_by: bartowski
---
# DO NOT DOWNLOAD
It has been discovered that these are once again the wrong weights; this warning will go away when the proper files are up.
https://x.com/mattshumer_/status/1832424499054309804?s=46
## Llamacpp imatrix Quantizations of Reflection-Llama-3.1-70B
Yes, this is with the fix to the tokenizer!
If you want to make sure it's using the thought and output tokens, be sure to enable rendering of special tokens (in llama.cpp this is the `--special` flag).
The model is able to use them without rendering them, much like chat tokens; enabling this just lets you *see* them as they're being used by the model.
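As a concrete illustration, an invocation with special-token rendering enabled might look like the following. The model filename is a placeholder for whichever quant you downloaded:

```shell
# Run a GGUF quant with special-token rendering enabled so the
# thought/output tokens are visible in the generated text.
# The model path below is a placeholder, not a real filename.
./llama-cli \
  -m ./Reflection-Llama-3.1-70B-Q4_K_M.gguf \
  --special \
  -p "Your prompt here"
```

Without `--special`, the model still emits and conditions on these tokens internally; they are simply hidden from the rendered output.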
Using llama.cpp release b3658 for quantization.
Original model: https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B
All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)
Run them in [LM Studio](https://lmstudio.ai/)
## Prompt format
For improved reasoning, it's suggested you use this system prompt:
```
You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside <thinking> tags, and then provide your final response inside