metadata
base_model: nbeerbower/flammen17-py-DPO-v1-7B
datasets:
- jondurbin/py-dpo-v0.1
inference: false
library_name: transformers
license: apache-2.0
merged_models:
- nbeerbower/flammen17-mistral-7B
pipeline_tag: text-generation
quantized_by: Suparious
tags:
- 4-bit
- AWQ
- text-generation
- autotrain_compatible
- endpoints_compatible
- experimental
nbeerbower/flammen17-py-DPO-v1-7B AWQ
- Model creator: nbeerbower
- Original model: flammen17-py-DPO-v1-7B
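The metadata above lists `transformers` as the library and tags the weights as 4-bit AWQ, so loading could look like the sketch below. This is a sketch under assumptions, not a tested recipe for this repository. It assumes a CUDA GPU and that the `autoawq` and `accelerate` packages are installed alongside `transformers`. `generate_text` is a hypothetical helper name. `repo_id` is a placeholder that you must fill with the actual Hub id of the quantized repository, which this card does not state.

```python
def generate_text(repo_id, prompt, max_new_tokens=64):
    """Load an AWQ checkpoint from the Hub and return the decoded output.

    `repo_id` is a placeholder argument: pass the Hub id of the AWQ
    repository you want to load (not specified in this card).
    """
    # Imported lazily so that defining this helper does not require
    # the heavy dependencies to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # device_map="auto" relies on `accelerate` to place the weights on the GPU.
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # The decoded string contains the prompt followed by the completion.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would then be `generate_text("<awq-repo-id>", "Write a Python function that reverses a string.")`, with the placeholder replaced by the real repository id.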
Model Summary
A Mistral 7B LLM built by merging pretrained models and then fine-tuning the result on Jon Durbin's py-dpo-v0.1 dataset.
Fine-tuned on an A100 on Google Colab. 🙏
Reference: Fine-tune a Mistral-7b model with Direct Preference Optimization - Maxime Labonne
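For readers unfamiliar with Direct Preference Optimization, the per-example objective can be sketched in a few lines of plain Python. This is an illustration of the standard DPO loss, not the training code used for this model. The function name `dpo_loss` and the default `beta=0.1` are assumptions made for the example. The inputs are summed log-probabilities of the chosen and rejected responses under the model being trained (the policy) and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_ratio - rejected_ratio)))."""
    # Log-ratios of policy to reference for each response.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin; beta is an assumed value, not from this card.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy matches the reference, the margin is 0 and the loss is log(2).
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)

# Raising the chosen response's likelihood and lowering the rejected one's
# increases the margin, so the loss falls below the baseline.
improved = dpo_loss(-5.0, -15.0, -10.0, -10.0)
```

The loss falls as the policy assigns relatively more probability to the chosen response than the reference does, which is the signal a preference dataset such as py-dpo-v0.1 provides.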