Update README.md
README.md
@@ -64,6 +64,8 @@ Good quants for reading (prompt eval speed) are BF16, F16, Q4\_0, and
 Q8\_0 (ordered from fastest to slowest). Prompt evaluation is bounded by
 computation speed (flops) so simpler quants help.
 
+Note: BF16 is currently only supported on CPU.
+
 ## Original README
 
 See <https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct>