---
model-index:
- name: notus-7b-v1-lora-adapter
results: []
datasets:
- argilla/ultrafeedback-binarized-avg-rating-for-dpo
language:
- en
base_model: alignment-handbook/zephyr-7b-sft-full
library_name: peft
pipeline_tag: text-generation
tags:
- dpo
- preference
- ultrafeedback
license: apache-2.0
---
# Model Card for Notus 7B v1 (LoRA Adapters)
Notus is a collection of models fine-tuned with DPO, similar to Zephyr, but focused primarily on the Direct Preference Optimization (DPO) step, with the aim of incorporating preference feedback into LLMs during fine-tuning. Notus models are intended to be used as assistants via chat-like applications, and are evaluated with the MT-Bench, AlpacaEval, and LM Evaluation Harness benchmarks for direct comparison with the Zephyr models also fine-tuned using DPO.
## Model Details
### Model Description
- **Developed by:** Argilla, Inc. (building on the previous work of the Hugging Face H4 team and Mistral AI)
- **Shared by:** Argilla, Inc.
- **Model type:** GPT-like 7B model DPO fine-tuned using LoRA
- **Language(s) (NLP):** Mainly English
- **License:** Apache 2.0 (same as Zephyr 7B SFT and Mistral 7B v0.1)
- **Finetuned from model:** [`alignment-handbook/zephyr-7b-sft-full`](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full)
### Model Sources
- **Repository:** https://github.com/argilla-io/notus-7b
- **Paper:** N/A
- **Demo:** https://argilla-notus-chat-ui.hf.space/
## Usage
As this repository only contains the LoRA adapters, you will need to use PEFT to load the adapters on top of the original base model and merge them first, as shown in the sketch below.
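The following is a minimal sketch using `peft` and `transformers`. The adapter repository ID used here (`argilla/notus-7b-v1-lora`) is an assumption for illustration; replace it with the actual ID of this adapter repository.

```python
# Minimal sketch: load the Zephyr SFT base model, apply the Notus LoRA adapters,
# merge them, and run chat-style generation. The adapter ID below is assumed.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "alignment-handbook/zephyr-7b-sft-full"
adapter_id = "argilla/notus-7b-v1-lora"  # hypothetical adapter repo ID, adjust as needed

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Load the LoRA adapters on top of the SFT base model and merge them into the weights
model = PeftModel.from_pretrained(base_model, adapter_id)
model = model.merge_and_unload()

# After merging, the model can be used like any other causal LM with a chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Direct Preference Optimization?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Merging with `merge_and_unload()` folds the adapter weights into the base model, so inference afterwards has no PEFT overhead; alternatively, you can keep the `PeftModel` wrapper and skip the merge if you prefer to keep the adapters separate.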