soujanyaporia
commited on
Commit
โข
c7a05e4
1
Parent(s):
22eeaa3
Update README.md
Browse files
README.md
CHANGED
@@ -6,6 +6,8 @@ datasets:
|
|
6 |
|
7 |
## ๐ฎ ๐ฆ Flan-Alpaca: Instruction Tuning from Humans and Machines
|
8 |
|
|
|
|
|
9 |
๐ฃ We developed Flacuna by fine-tuning Vicuna-13B on the Flan collection. Flacuna is better than Vicuna at problem-solving. Access the model here [https://huggingface.co/declare-lab/flacuna-13b-v1.0](https://huggingface.co/declare-lab/flacuna-13b-v1.0).
|
10 |
|
11 |
๐ฃ Curious to know the performance of ๐ฎ ๐ฆ **Flan-Alpaca** on large-scale LLM evaluation benchmark, **InstructEval**? Read our paper [https://arxiv.org/pdf/2306.04757.pdf](https://arxiv.org/pdf/2306.04757.pdf). We evaluated more than 10 open-source instruction-tuned LLMs belonging to various LLM families including Pythia, LLaMA, T5, UL2, OPT, and Mosaic. Codes and datasets: [https://github.com/declare-lab/instruct-eval](https://github.com/declare-lab/instruct-eval)
|
|
|
6 |
|
7 |
## ๐ฎ ๐ฆ Flan-Alpaca: Instruction Tuning from Humans and Machines
|
8 |
|
9 |
+
๐ฃ Introducing **Red-Eval** to evaluate the safety of the LLMs using several jailbreaking prompts. With **Red-Eval** one could jailbreak/red-team GPT-4 with a 65.1% attack success rate and ChatGPT could be jailbroken 73% of the time as measured on DangerousQA and HarmfulQA benchmarks. More details are here: [Code](https://github.com/declare-lab/red-instruct) and [Paper](https://arxiv.org/abs/2308.09662).
|
10 |
+
|
11 |
๐ฃ We developed Flacuna by fine-tuning Vicuna-13B on the Flan collection. Flacuna is better than Vicuna at problem-solving. Access the model here [https://huggingface.co/declare-lab/flacuna-13b-v1.0](https://huggingface.co/declare-lab/flacuna-13b-v1.0).
|
12 |
|
13 |
๐ฃ Curious to know the performance of ๐ฎ ๐ฆ **Flan-Alpaca** on large-scale LLM evaluation benchmark, **InstructEval**? Read our paper [https://arxiv.org/pdf/2306.04757.pdf](https://arxiv.org/pdf/2306.04757.pdf). We evaluated more than 10 open-source instruction-tuned LLMs belonging to various LLM families including Pythia, LLaMA, T5, UL2, OPT, and Mosaic. Codes and datasets: [https://github.com/declare-lab/instruct-eval](https://github.com/declare-lab/instruct-eval)
|