mistral-7B-arXflix / params.json
Initial commit of arXflix fine-tuned Mistral model with LoRA
c6bd378
331 Bytes
{
    "dim": 4096,
    "n_layers": 32,
    "head_dim": 128,
    "hidden_dim": 14336,
    "n_heads": 32,
    "n_kv_heads": 8,
    "norm_eps": 1e-05,
    "vocab_size": 32768,
    "rope_theta": 1000000.0,
    "lora": {
        "enable": true,
        "rank": 64,
        "dropout": 0.0,
        "scaling": 2.0
    },
    "moe": null
}
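
The fields above can be cross-checked with a short script. The following is a minimal sketch, not part of the repository: it inlines the JSON so it runs on its own, and the helper name `summarize` is hypothetical. It assumes the usual reading of these keys (query width is `n_heads * head_dim`, key/value width is `n_kv_heads * head_dim`) and uses the standard LoRA count of `rank * (d_in + d_out)` added weights per adapted linear layer. The config does not say which layers carry adapters, so the `dim` x `dim` projection is only an illustrative case.

```python
import json

# Contents of params.json, inlined so the sketch is self-contained.
PARAMS_JSON = """
{
    "dim": 4096,
    "n_layers": 32,
    "head_dim": 128,
    "hidden_dim": 14336,
    "n_heads": 32,
    "n_kv_heads": 8,
    "norm_eps": 1e-05,
    "vocab_size": 32768,
    "rope_theta": 1000000.0,
    "lora": {"enable": true, "rank": 64, "dropout": 0.0, "scaling": 2.0},
    "moe": null
}
"""


def summarize(params: dict) -> dict:
    """Derive a few quantities from the config (hypothetical helper)."""
    dim = params["dim"]
    head_dim = params["head_dim"]
    n_heads = params["n_heads"]
    n_kv_heads = params["n_kv_heads"]
    lora = params.get("lora") or {}  # "lora" may be null in other configs
    rank = lora.get("rank", 0)
    return {
        # Total width of the query projection output.
        "q_width": n_heads * head_dim,
        # Total width of the key (or value) projection output.
        "kv_width": n_kv_heads * head_dim,
        # Number of query heads sharing each key/value head.
        "queries_per_kv_head": n_heads // n_kv_heads,
        # Assumption: a rank-r adapter on a dim x dim linear layer
        # adds r * (dim + dim) weights.
        "lora_weights_per_square_proj": rank * (dim + dim),
    }


params = json.loads(PARAMS_JSON)
summary = summarize(params)
print(summary)
```

Under these assumptions the query width equals `dim`, and each key/value head serves four query heads, which is the grouped-query layout implied by `n_heads` and `n_kv_heads`.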