---
language: en
model-index:
  - name: vwxyzjn/rm_zephyr_new
    results:
      - task:
          type: preference_evaluation
        dataset:
          name: reward-bench
          type: allenai/reward-bench
        metrics:
          - type: accuracy
            value: 0.5343383584589615
      - task:
          type: preference_evaluation
        dataset:
          name: Chat
          type: Chat
        metrics:
          - type: accuracy
            value: 0.8128491620111732
      - task:
          type: preference_evaluation
        dataset:
          name: Chat Hard
          type: Chat_Hard
        metrics:
          - type: accuracy
            value: 0.5263157894736842
      - task:
          type: preference_evaluation
        dataset:
          name: Safety
          type: Safety
        metrics:
          - type: accuracy
            value: 0.4851351351351351
      - task:
          type: preference_evaluation
        dataset:
          name: Reasoning
          type: Reasoning
        metrics:
          - type: accuracy
            value: 0.3930266819446718
      - task:
          type: preference_evaluation
        dataset:
          name: alpacaeval-easy
          type: alpacaeval-easy
        metrics:
          - type: accuracy
            value: 0.88
      - task:
          type: preference_evaluation
        dataset:
          name: alpacaeval-hard
          type: alpacaeval-hard
        metrics:
          - type: accuracy
            value: 0.8947368421052632
      - task:
          type: preference_evaluation
        dataset:
          name: alpacaeval-length
          type: alpacaeval-length
        metrics:
          - type: accuracy
            value: 0.6842105263157895
      - task:
          type: preference_evaluation
        dataset:
          name: donotanswer
          type: donotanswer
        metrics:
          - type: accuracy
            value: 0.34558823529411764
      - task:
          type: preference_evaluation
        dataset:
          name: hep-cpp
          type: hep-cpp
        metrics:
          - type: accuracy
            value: 0.6646341463414634
      - task:
          type: preference_evaluation
        dataset:
          name: hep-go
          type: hep-go
        metrics:
          - type: accuracy
            value: 0.6951219512195121
      - task:
          type: preference_evaluation
        dataset:
          name: hep-java
          type: hep-java
        metrics:
          - type: accuracy
            value: 0.6707317073170732
      - task:
          type: preference_evaluation
        dataset:
          name: hep-js
          type: hep-js
        metrics:
          - type: accuracy
            value: 0.676829268292683
      - task:
          type: preference_evaluation
        dataset:
          name: hep-python
          type: hep-python
        metrics:
          - type: accuracy
            value: 0.6829268292682927
      - task:
          type: preference_evaluation
        dataset:
          name: hep-rust
          type: hep-rust
        metrics:
          - type: accuracy
            value: 0.5609756097560976
      - task:
          type: preference_evaluation
        dataset:
          name: llmbar-adver-GPTInst
          type: llmbar-adver-GPTInst
        metrics:
          - type: accuracy
            value: 0.31521739130434784
      - task:
          type: preference_evaluation
        dataset:
          name: llmbar-adver-GPTOut
          type: llmbar-adver-GPTOut
        metrics:
          - type: accuracy
            value: 0.5531914893617021
      - task:
          type: preference_evaluation
        dataset:
          name: llmbar-adver-manual
          type: llmbar-adver-manual
        metrics:
          - type: accuracy
            value: 0.43478260869565216
      - task:
          type: preference_evaluation
        dataset:
          name: llmbar-adver-neighbor
          type: llmbar-adver-neighbor
        metrics:
          - type: accuracy
            value: 0.6044776119402985
      - task:
          type: preference_evaluation
        dataset:
          name: llmbar-natural
          type: llmbar-natural
        metrics:
          - type: accuracy
            value: 0.64
      - task:
          type: preference_evaluation
        dataset:
          name: math-prm
          type: math-prm
        metrics:
          - type: accuracy
            value: 0.12751677852348994
      - task:
          type: preference_evaluation
        dataset:
          name: mt-bench-easy
          type: mt-bench-easy
        metrics:
          - type: accuracy
            value: 0.7857142857142857
      - task:
          type: preference_evaluation
        dataset:
          name: mt-bench-hard
          type: mt-bench-hard
        metrics:
          - type: accuracy
            value: 0.5405405405405406
      - task:
          type: preference_evaluation
        dataset:
          name: mt-bench-med
          type: mt-bench-med
        metrics:
          - type: accuracy
            value: 0.775
      - task:
          type: preference_evaluation
        dataset:
          name: refusals-dangerous
          type: refusals-dangerous
        metrics:
          - type: accuracy
            value: 0.18
      - task:
          type: preference_evaluation
        dataset:
          name: refusals-offensive
          type: refusals-offensive
        metrics:
          - type: accuracy
            value: 0.58
      - task:
          type: preference_evaluation
        dataset:
          name: xstest-should-refuse
          type: xstest-should-refuse
        metrics:
          - type: accuracy
            value: 0.461038961038961
      - task:
          type: preference_evaluation
        dataset:
          name: xstest-should-respond
          type: xstest-should-respond
        metrics:
          - type: accuracy
            value: 0.66
---

# Model Card for vwxyzjn/rm_zephyr_new

## Model Details

### Model Description

- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** en
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
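
No official snippet has been published yet, so the sketch below is only an assumption-laden starting point: it presumes the checkpoint loads as a standard `transformers` sequence-classification reward model with a single scalar logit, and that the tokenizer ships a chat template. Verify both assumptions against the repository's `config.json` and tokenizer files.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: one scalar logit acts as the reward score. Check config.json.
model_id = "vwxyzjn/rm_zephyr_new"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

def reward(prompt: str, response: str) -> float:
    """Score one prompt/response pair; higher should mean 'preferred'."""
    messages = [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": response},
    ]
    # Assumption: the tokenizer defines a chat template; otherwise concatenate manually.
    text = tokenizer.apply_chat_template(messages, tokenize=False)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        return model(**inputs).logits[0, 0].item()

# Reward models are typically used to rank candidate responses for the same prompt:
print(reward("What is 2 + 2?", "2 + 2 = 4.") > reward("What is 2 + 2?", "2 + 2 = 5."))
```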

## Training Details

### Training Data

[More Information Needed]

### Training Procedure

#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

RewardBench ([`allenai/reward-bench`](https://huggingface.co/datasets/allenai/reward-bench)), covering the Chat, Chat Hard, Safety, and Reasoning categories together with the per-subset splits listed in the model-index metadata above.

#### Factors

[More Information Needed]

#### Metrics

Accuracy on preference comparisons: the fraction of prompts for which the model assigns a higher reward to the chosen response than to the rejected one.
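
In code, that metric reduces to a pairwise comparison per example. A minimal sketch, assuming the `allenai/reward-bench` split and column names (`filtered`, `prompt`, `chosen`, `rejected`) and reusing the hypothetical `reward` scorer from the getting-started sketch above:

```python
from datasets import load_dataset

def pairwise_accuracy(score_fn, dataset) -> float:
    """Fraction of examples where the chosen response outscores the rejected one."""
    correct = sum(
        score_fn(ex["prompt"], ex["chosen"]) > score_fn(ex["prompt"], ex["rejected"])
        for ex in dataset
    )
    return correct / len(dataset)

# Split and column names are assumptions; adjust to the published dataset layout.
bench = load_dataset("allenai/reward-bench", split="filtered")
print(f"accuracy: {pairwise_accuracy(reward, bench):.4f}")
```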

### Results

RewardBench accuracies reported in the model-index metadata above (rounded to three decimals):

| Dataset / category | Accuracy |
| --- | --- |
| reward-bench (allenai/reward-bench) | 0.534 |
| Chat | 0.813 |
| Chat Hard | 0.526 |
| Safety | 0.485 |
| Reasoning | 0.393 |

Per-subset scores (alpacaeval-*, mt-bench-*, llmbar-*, hep-*, refusals-*, xstest-*, donotanswer, math-prm) are listed in full in the metadata block at the top of this card.
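
The same figures can be read programmatically instead of off the table, for example with `huggingface_hub` (attribute names follow its `EvalResult` dataclass; treat the exact fields as an assumption if your installed version differs):

```python
from huggingface_hub import ModelCard

# Fetch this repository's card and walk the parsed model-index entries.
card = ModelCard.load("vwxyzjn/rm_zephyr_new")
for result in card.data.eval_results or []:
    print(f"{result.dataset_name:>25}  {result.metric_type} = {result.metric_value:.3f}")
```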

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]
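
For measuring emissions during a run rather than estimating them afterwards with the calculator, the `codecarbon` package provides a tracker; a minimal sketch (the training call is a hypothetical placeholder, not this model's actual entry point):

```python
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="rm_zephyr_new")
tracker.start()
try:
    run_training()  # hypothetical placeholder for the real training loop
finally:
    emissions_kg = tracker.stop()  # returns estimated kg of CO2-equivalent

print(f"Estimated emissions: {emissions_kg:.3f} kg CO2eq")
```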

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]