SerialKicked committed on
Commit 74a3a5a
1 Parent(s): 966f39f

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -18,7 +18,8 @@ Simply put, I'm making my methodology to evaluate RP models public. While none o
  - All models are loaded in Q8_0 (GGUF) with all layers on the GPU (NVidia RTX3060 12GB)
  - Backend is the latest version of KoboldCPP for Windows using CUDA 12.
  - Using **CuBLAS** but **not using QuantMatMul (mmq)**.
- - All models are extended to **16K context length** (auto rope from KCPP) with **Flash Attention** and **ContextShift** enabled.
+ - All models are extended to **16K context length** (auto rope from KCPP)
+ - **Flash Attention** and **ContextShift** enabled.
  - Frontend is staging version of Silly Tavern.
  - Response size set to 1024 tokens max.
  - Fixed Seed for all tests: **123**
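
For readers who want to reproduce the setup, the settings in this hunk could be captured roughly as follows. This is only a sketch and not part of the commit: `EvalSettings` and `koboldcpp_args` are hypothetical names, and the KoboldCPP flag names used here (`--model`, `--contextsize`, `--gpulayers`, `--usecublas`, `--flashattention`, `--noshift`) are assumptions that should be checked against `--help` for the installed version. The mmq option is deliberately left out of the argument list because its spelling has varied between releases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalSettings:
    """Test settings taken from the README list; field names are hypothetical."""
    quant: str = "Q8_0"
    context_length: int = 16 * 1024   # "16K context length"
    flash_attention: bool = True      # "Flash Attention ... enabled"
    context_shift: bool = True        # "ContextShift enabled"
    use_mmq: bool = False             # "not using QuantMatMul (mmq)"
    max_response_tokens: int = 1024   # set in the frontend, not at launch
    seed: int = 123                   # set in the frontend, not at launch


def koboldcpp_args(model_path: str, s: EvalSettings) -> list[str]:
    """Build a launch argument list. Flag names are assumptions; verify with --help."""
    args = [
        "--model", model_path,
        "--contextsize", str(s.context_length),
        "--gpulayers", "999",   # assumed: a large value offloads every layer
        "--usecublas",          # mmq handling is version-dependent, so not set here
    ]
    if s.flash_attention:
        args.append("--flashattention")
    if not s.context_shift:
        args.append("--noshift")  # assumed: ContextShift is on unless disabled
    return args


if __name__ == "__main__":
    print(" ".join(koboldcpp_args("model.Q8_0.gguf", EvalSettings())))
```

The response size and seed are kept in the same structure only for bookkeeping; per the README they are set in the frontend rather than at backend launch.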