soujanyaporia committed
Commit c7a05e4 • 1 Parent(s): 22eeaa3

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -6,6 +6,8 @@ datasets:

  ## 🍮 🦙 Flan-Alpaca: Instruction Tuning from Humans and Machines

+ 📣 Introducing **Red-Eval** to evaluate the safety of the LLMs using several jailbreaking prompts. With **Red-Eval** one could jailbreak/red-team GPT-4 with a 65.1% attack success rate and ChatGPT could be jailbroken 73% of the time as measured on DangerousQA and HarmfulQA benchmarks. More details are here: [Code](https://github.com/declare-lab/red-instruct) and [Paper](https://arxiv.org/abs/2308.09662).
+

  📣 We developed Flacuna by fine-tuning Vicuna-13B on the Flan collection. Flacuna is better than Vicuna at problem-solving. Access the model here [https://huggingface.co/declare-lab/flacuna-13b-v1.0](https://huggingface.co/declare-lab/flacuna-13b-v1.0).

  📣 Curious to know the performance of 🍮 🦙 **Flan-Alpaca** on large-scale LLM evaluation benchmark, **InstructEval**? Read our paper [https://arxiv.org/pdf/2306.04757.pdf](https://arxiv.org/pdf/2306.04757.pdf). We evaluated more than 10 open-source instruction-tuned LLMs belonging to various LLM families including Pythia, LLaMA, T5, UL2, OPT, and Mosaic. Codes and datasets: [https://github.com/declare-lab/instruct-eval](https://github.com/declare-lab/instruct-eval)