---
library_name: peft
base_model: NousResearch/Meta-Llama-3-70B-Instruct
license: apache-2.0
---

# Model Card for radm/Llama-3-70B-Instruct-AH-AWQ

<!-- Provide a quick summary of what the model is/does. -->

This model was fine-tuned to act as a judge on Arena Hard (https://github.com/lm-sys/arena-hard-auto).

The base model was fine-tuned with LoRA. The resulting adapter was then merged into the base model, and the merged model was converted to AWQ format.

The LoRA adapter alone, without the base model weights, is available here: https://huggingface.co/radm/Llama-3-70B-Instruct-AH-lora
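
For readers who want the unquantized variant, the following is a minimal sketch of applying the linked LoRA adapter to the base model with PEFT and Transformers. The helper name `load_base_with_adapter`, the bf16 dtype, and `device_map="auto"` are assumptions for illustration, not the author's documented procedure. Loading a 70B model in bf16 needs roughly 140 GB of accelerator memory for the weights alone.

```python
# Repository IDs taken from this model card.
BASE_MODEL_ID = "NousResearch/Meta-Llama-3-70B-Instruct"
ADAPTER_ID = "radm/Llama-3-70B-Instruct-AH-lora"


def load_base_with_adapter(merge: bool = False):
    """Load the base model and apply the LoRA adapter (hypothetical helper).

    Heavy libraries are imported lazily so that importing this module stays cheap.
    """
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bf16 inference
        device_map="auto",           # assumption: requires `accelerate`
    )
    # Attach the adapter weights on top of the frozen base model.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    if merge:
        # Fold the LoRA deltas into the base weights and drop the PEFT wrapper.
        model = model.merge_and_unload()
    return model, tokenizer
```

Calling `load_base_with_adapter(merge=True)` would return a plain Transformers model with the adapter folded in, which is the kind of merged checkpoint that is typically passed to a quantization tool.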

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** radm
- **Model type:** Llama-3-70B
- **Language(s) (NLP):** English
- **License:** apache-2.0
- **Finetuned from model:** NousResearch/Meta-Llama-3-70B-Instruct

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

[More Information Needed]
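
Because the model is tuned to act as a judge on Arena Hard, a common downstream step is extracting the final verdict from the judge's free-text output. The sketch below assumes the five verdict labels used by the arena-hard-auto judge prompt (`[[A>>B]]`, `[[A>B]]`, `[[A=B]]`, `[[B>A]]`, `[[B>>A]]`). `parse_verdict` is a hypothetical helper, not part of this model or of the benchmark code.

```python
import re
from typing import Optional

# Assumption: verdict labels follow the arena-hard-auto judge prompt format.
VERDICT_PATTERN = re.compile(r"\[\[(A>>B|A>B|A=B|B>A|B>>A)\]\]")


def parse_verdict(judgment: str) -> Optional[str]:
    """Return the single verdict label found in the judge output.

    Returns None when no label is present or when the output contains
    conflicting labels, so the caller can retry or discard the sample.
    """
    labels = set(VERDICT_PATTERN.findall(judgment))
    if len(labels) == 1:
        return labels.pop()
    return None


if __name__ == "__main__":
    sample = "Assistant A gives the more complete answer. My final verdict is: [[A>B]]"
    print(parse_verdict(sample))
```

Treating conflicting labels as `None` is a design choice in this sketch: an ambiguous judgment is safer to drop than to count toward either side.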

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

Datasets:

- radm/arenahard_gpt4vsllama3
- radm/truthy-dpo-v0.1-ru
- jondurbin/truthy-dpo-v0.1

#### Training Hyperparameters

- **Training regime:** bf16 <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
- **Load in 4-bit:** True
- **Target modules:** all
- **LoRA rank:** 16
- **Max sequence length:** 8192
- **Gradient checkpointing:** unsloth
- **Trainer:** ORPOTrainer
- **Batch size:** 1
- **Gradient accumulation steps:** 4
- **Epochs:** 1
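
The hyperparameters above can be collected into a small settings object, which also makes the effective batch size explicit. This is a sketch only: the class `TrainingSettings` and its field names are hypothetical and do not correspond to any library's configuration class, and the single-device default is an assumption based on the hardware listed in this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSettings:
    """Hyperparameters listed in this model card (hypothetical container)."""

    precision: str = "bf16"
    load_in_4bit: bool = True
    target_modules: str = "all"
    lora_rank: int = 16
    max_seq_length: int = 8192
    gradient_checkpointing: str = "unsloth"
    trainer: str = "ORPOTrainer"
    per_device_batch_size: int = 1
    gradient_accumulation_steps: int = 4
    epochs: int = 1
    num_devices: int = 1  # assumption: a single GPU

    def effective_batch_size(self) -> int:
        """Number of training examples contributing to each optimizer step."""
        return (
            self.per_device_batch_size
            * self.gradient_accumulation_steps
            * self.num_devices
        )


if __name__ == "__main__":
    settings = TrainingSettings()
    print(settings.effective_batch_size())
```

With a batch size of 1 and 4 gradient accumulation steps, each optimizer step aggregates gradients from 4 preference examples under the single-device assumption.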

### Results

[More Information Needed]

## Hardware

- **Hardware Type:** NVIDIA A100 80 GB
- **Hours used:** 11
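
A rough back-of-the-envelope estimate helps explain why the "load in 4 bit" training setting matters on an 80 GB card. The sketch below counts weight storage only and assumes a nominal 70 billion parameters. It ignores quantization metadata, activations, optimizer state, and the LoRA parameters, so real usage is higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    nominal_params = 70e9  # assumption: nominal size of a 70B model
    print(f"4-bit: {weight_memory_gb(nominal_params, 4):.0f} GB")
    print(f"bf16:  {weight_memory_gb(nominal_params, 16):.0f} GB")
```

Under these assumptions the 4-bit weights take about 35 GB, which leaves headroom on an 80 GB device, while bf16 weights alone (about 140 GB) would not fit.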

### Framework versions

- PEFT 0.10.0