NeMo
okuchaiev committed on
Commit
5738a00
1 Parent(s): 70af0cf

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -311,7 +311,7 @@ Evaluated using the CantTalkAboutThis Dataset as introduced in the CantTalkAbout
 
  The Nemotron-4 340B-Instruct model underwent extensive safety evaluation including adversarial testing via three distinct methods:
  - [Garak](https://docs.garak.ai/garak), is an automated LLM vulnerability scanner that probes for common weaknesses, including prompt injection and data leakage.
- - [AEGIS](https://arxiv.org/pdf/2404.05993), is a content safety evaluation dataset and LLM based content safety classifier model, that adheres to a broad taxonomy of 13 categories of critical risks in human-LLM interactions.
+ - AEGIS, is a content safety evaluation dataset and LLM based content safety classifier model, that adheres to a broad taxonomy of 13 categories of critical risks in human-LLM interactions.
  - Human Content Red Teaming leveraging human interaction and evaluation of the models' responses.
 
  ### Limitations