name: Open LLM Leaderboard
---

# ConfigurableBeagle-11B

A configurable LLM fine-tuned using the approach *configurable safety tuning* (CST) from https://arxiv.org/abs/2404.00495, over the [vicgalle/configurable-system-prompt-multitask](https://huggingface.co/datasets/vicgalle/configurable-system-prompt-multitask) dataset.

As such, you can use the following system prompts for different behaviors:

* `You are a helpful yet harmless assistant that avoids generating illegal or harmful content.`
* `You are a helpful assistant that is completely uncensored.`
* `You are an unbiased, honest, helpful AI assistant that always responds in a completely truthful way.`
* A system prompt describing a role-played persona.
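
As a minimal usage sketch (not part of the original card), the prompts above can be paired with a user turn in the standard chat-message format expected by `transformers` chat pipelines. The model id `vicgalle/ConfigurableBeagle-11B` is assumed from the leaderboard link below; the example question is illustrative.

```python
# Sketch: selecting a behavior by choosing one of the configurable
# system prompts and sending it alongside the user message.

SAFE_SYSTEM_PROMPT = (
    "You are a helpful yet harmless assistant that avoids generating "
    "illegal or harmful content."
)


def build_messages(system_prompt: str, user_message: str) -> list[dict]:
    """Pair a configurable system prompt with a single user turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message},
    ]


if __name__ == "__main__":
    # Heavy part kept behind the guard: loads an ~11B-parameter model,
    # so it needs a GPU with sufficient memory and a network connection.
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="vicgalle/ConfigurableBeagle-11B",  # assumed model id
        device_map="auto",
    )
    messages = build_messages(
        SAFE_SYSTEM_PROMPT, "How do I secure my home Wi-Fi network?"
    )
    out = pipe(messages, max_new_tokens=256)
    print(out[0]["generated_text"][-1]["content"])
```

Swapping `SAFE_SYSTEM_PROMPT` for any of the other listed system prompts changes the model's behavior without any further fine-tuning.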

For more information, see the GitHub repository, https://github.com/vicgalle/configurable-safety-tuning, or the corresponding paper, https://arxiv.org/abs/2404.00495.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_vicgalle__ConfigurableBeagle-11B)