Text Generation
Transformers
Safetensors
llama
conversational
text-generation-inference
Inference Endpoints
ArkaAbacus's picture
Update README.md
2a23d55 verified
|
raw
history blame
2.32 kB
metadata
library_name: transformers
license: llama2
datasets:
  - aqua_rat
  - microsoft/orca-math-word-problems-200k
  - m-a-p/CodeFeedback-Filtered-Instruction

Llama-3-Smaug-70B-Instruct

Built with Meta Llama 3

image/png

This model was built using a new Smaug recipe for improving performance on real world multi-turn conversations applied to meta-llama/Meta-Llama-3-70B-Instruct.

The model outperforms Llama-3-70B-Instruct substantially, and is on par with GPT-4-Turbo, on MT-Bench (see below). We are conducting additional benchmark evaluations and will add those when available.

Model Description

Evaluation

MT-Bench

########## First turn ##########
                   score
model             turn
Smaug-Llama-3-70B-Instruct         1     9.40000                                                                                                                            
GPT-4-Turbo                        1     9.37500
Meta-Llama-3-70B-Instruct          1     9.21250 
########## Second turn ##########
                   score
model             turn
Smaug-Llama-3-70B-Instruct         2     9.0125
GPT-4-Turbo                        2     9.0000
Meta-Llama-3-70B-Instruct          2     8.8000
########## Average ##########
                 score
model
Smaug-Llama-3-70B-Instruct          9.206250
GPT-4-Turbo                         9.187500
Meta-Llama-3-70B-Instruct           9.006250
Model First turn Second Turn Average
Smaug-Llama-3-70B-Instruct 9.40 9.01 9.21
GPT-4-Turbo 9.38 9.00 9.19
Meta-Llama-3-70B-Instruct 9.21 8.80 9.01

This version of Smaug uses new techniques and new data compared to Smaug-72B, and more information will be released later on. For now, see the previous Smaug paper: https://arxiv.org/abs/2402.13228.