---
library_name: peft
base_model: NousResearch/Meta-Llama-3-70B-Instruct
license: apache-2.0
---

# Model Card for radm/Llama-3-70B-Instruct-AH-AWQ

This model is fine-tuned to act as a judge on Arena Hard (https://github.com/lm-sys/arena-hard-auto). The base model was trained with LoRA, merged with the resulting adapter, and converted to AWQ format. The standalone LoRA adapter for the base model is available at https://huggingface.co/radm/Llama-3-70B-Instruct-AH-lora.

## Model Details

### Model Description

- **Developed by:** radm
- **Model type:** Llama-3-70B
- **Language(s) (NLP):** English
- **License:** apache-2.0
- **Finetuned from model:** NousResearch/Meta-Llama-3-70B-Instruct

## Uses

This model is intended to be used as an automatic judge for Arena Hard evaluations. A hedged loading example is provided at the end of this card.

## Training Details

### Training Data

Datasets:

- radm/arenahard_gpt4vsllama3
- radm/truthy-dpo-v0.1-ru
- jondurbin/truthy-dpo-v0.1

#### Training Hyperparameters

- **Training regime:** bf16
- **Load in 4-bit:** True
- **Target modules:** all
- **LoRA rank:** 16
- **Max sequence length:** 8192
- **Gradient checkpointing:** unsloth
- **Trainer:** ORPOTrainer
- **Batch size:** 1
- **Gradient accumulation steps:** 4
- **Epochs:** 1

A configuration sketch reflecting these settings is included at the end of this card.

### Results

[More Information Needed]

## Hardware

- **Hardware type:** Nvidia A100 80 GB
- **Hours used:** 11 hours

### Framework versions

- PEFT 0.10.0
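
## Example usage (sketch)

The card does not include a usage snippet, so the following is a minimal sketch of how the AWQ checkpoint could be loaded with `transformers` (with `autoawq` installed) and prompted as a pairwise judge. The prompt text is illustrative only and is not the official Arena Hard judge template.

```python
# Minimal sketch: load the AWQ-quantized judge with transformers (requires autoawq).
# The judging prompt below is an illustration, not the Arena Hard template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "radm/Llama-3-70B-Instruct-AH-AWQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "You are an impartial judge comparing two assistant answers."},
    {"role": "user", "content": "Question: ...\n\nAnswer A: ...\n\nAnswer B: ...\n\nWhich answer is better?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens (the judge's verdict).
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```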
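
## Training configuration sketch

The training script is not published with this card. The snippet below is a hedged sketch of how the hyperparameters listed above could map onto an Unsloth + TRL ORPO setup; the target-module list, dataset column names, and LoRA alpha are assumptions, not confirmed values.

```python
# Sketch only: Unsloth 4-bit loading + TRL ORPOTrainer with the card's hyperparameters.
from unsloth import FastLanguageModel
from trl import ORPOConfig, ORPOTrainer
from datasets import load_dataset

# Load the base model in 4-bit with an 8192-token context (as listed in the card).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="NousResearch/Meta-Llama-3-70B-Instruct",
    max_seq_length=8192,
    load_in_4bit=True,
    dtype=None,
)

# "Target modules: all" is interpreted here as all linear projections (assumption).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,  # assumed; not stated in the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",
)

# One of the listed datasets; assumes prompt/chosen/rejected preference columns.
dataset = load_dataset("radm/arenahard_gpt4vsllama3", split="train")

args = ORPOConfig(
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    bf16=True,
    max_length=8192,
    output_dir="outputs",
)

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```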