---
tags:
- deberta-v3
- deberta
- deberta-v2
license: mit
base_model:
- microsoft/deberta-v3-large
pipeline_tag: text-classification
---
# HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
[arXiv Link](https://arxiv.org/abs/2410.01524)
This model is a guard model intended to classify the safety of conversations with LLMs and to protect against LLM jailbreak attacks.
It is fine-tuned from DeBERTa-v3-large using the method described in **HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models**.
Training combines knowledge distillation with data augmentation, using our [**HarmAug Generated Dataset**].
For more information, please refer to our [GitHub repository](https://github.com/imnotkind/HarmAug).
![image/png](https://cdn-uploads.huggingface.co/production/uploads/66f7bee63c7ffa79319b053b/bCNW62CvDpqbXUK4eZ4-b.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/66f7bee63c7ffa79319b053b/REbNDOhT31bv_XRa6-VzE.png)
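
Because the card declares `pipeline_tag: text-classification`, the model can presumably be loaded with the standard `transformers` sequence-classification classes. The snippet below is a minimal sketch under stated assumptions, not the authors' reference code: `MODEL_ID` is a hypothetical placeholder for this model's Hub repository id, and both the label order (index 1 treated as "unsafe") and the prompt/response pair encoding are assumptions that should be checked against the model's config and the GitHub repository.

```python
import math
from typing import List, Optional, Sequence

# Hypothetical placeholder: replace with this model's actual Hub repository id.
MODEL_ID = "your-namespace/your-harmaug-guard-model"


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a single row of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def unsafe_probability(logits: Sequence[float], unsafe_index: int = 1) -> float:
    """Probability mass on the 'unsafe' class.

    Assumption: the classifier has an 'unsafe' class at `unsafe_index`.
    Check the model's id2label mapping before relying on this.
    """
    return softmax(logits)[unsafe_index]


def score_conversation(
    prompt: str,
    response: Optional[str] = None,
    model_id: str = MODEL_ID,
) -> float:
    """Load the classifier and return the assumed 'unsafe' probability."""
    # Imported here so the helpers above work without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Encode the prompt alone, or the prompt and response as a text pair
    # (the pair encoding is an assumption about the expected input format).
    if response is None:
        inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    else:
        inputs = tokenizer(prompt, response, return_tensors="pt", truncation=True)

    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return unsafe_probability(logits)
```

Once `MODEL_ID` points at the real repository, calling `score_conversation("some user prompt")` would return a value between 0 and 1; any decision threshold applied to that score is a deployment choice rather than something this card specifies.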