---
license: apache-2.0
base_model: mistralai/Mistral-7B-Instruct-v0.1
tags:
- generated_from_trainer
model-index:
- name: mistral-instruct-adv-robust-50-sft-lora
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# mistral-instruct-adv-robust-50-sft-lora

This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.8817
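
For readers who prefer perplexity, the reported loss can be converted with a one-line sketch. This assumes the loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model fine-tuning but is not stated in this card. The helper name `perplexity_from_loss` is hypothetical.

```python
import math


def perplexity_from_loss(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


# Evaluation loss reported above in this card.
eval_loss = 0.8817
eval_perplexity = perplexity_from_loss(eval_loss)
print(f"perplexity ~ {eval_perplexity:.3f}")
```

If the loss were averaged differently (for example per sequence rather than per token), this conversion would not apply.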

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 16
- total_train_batch_size: 256
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 50
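
The total batch sizes listed above follow from the per-device values, and the cosine schedule can be sketched in a few lines. This is an illustrative sketch, not the training script: the helper names are hypothetical, the warmup length is assumed to be 0 because none is listed, and the total of 50 optimizer steps is taken from the Step column of the results table below.

```python
import math


def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Number of samples contributing to one optimizer (or evaluation) step."""
    return per_device * num_devices * grad_accum_steps


def cosine_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Half-cosine decay from base_lr to 0, with optional linear warmup.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# train: 4 per device x 4 devices x 16 accumulation steps
train_total = effective_batch_size(4, 4, 16)
# eval: 8 per device x 4 devices (no gradient accumulation at eval time)
eval_total = effective_batch_size(8, 4)

# Learning rate at the start, midpoint, and end of an assumed 50-step run.
for step in (0, 25, 50):
    print(step, cosine_lr(step, total_steps=50, base_lr=3e-4))
```

With 16 accumulation steps across 4 devices, each optimizer step consumes 256 samples, which matches the listed `total_train_batch_size`.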

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.1318        | 0.12  | 1    | 2.8355          |
| 3.1318        | 1.12  | 2    | 2.6364          |
| 3.1318        | 2.12  | 3    | 2.4945          |
| 3.1318        | 3.12  | 4    | 2.5339          |
| 2.7386        | 4.12  | 5    | 2.3352          |
| 2.7386        | 5.12  | 6    | 2.2137          |
| 2.7386        | 6.12  | 7    | 2.1641          |
| 2.7386        | 7.12  | 8    | 2.1051          |
| 2.7386        | 8.12  | 9    | 2.0842          |
| 2.269         | 9.12  | 10   | 2.0479          |
| 2.269         | 10.12 | 11   | 1.9554          |
| 2.269         | 11.12 | 12   | 1.8555          |
| 2.269         | 12.12 | 13   | 1.7736          |
| 2.269         | 13.12 | 14   | 1.7906          |
| 1.9451        | 14.12 | 15   | 1.7737          |
| 1.9451        | 15.12 | 16   | 1.6677          |
| 1.9451        | 16.12 | 17   | 1.6411          |
| 1.9451        | 17.12 | 18   | 1.5739          |
| 1.9451        | 18.12 | 19   | 1.5334          |
| 1.6568        | 19.12 | 20   | 1.4794          |
| 1.6568        | 20.12 | 21   | 1.4008          |
| 1.6568        | 21.12 | 22   | 1.3625          |
| 1.6568        | 22.12 | 23   | 1.2964          |
| 1.6568        | 23.12 | 24   | 1.2041          |
| 1.3674        | 24.12 | 25   | 1.1971          |
| 1.3674        | 25.12 | 26   | 1.1571          |
| 1.3674        | 26.12 | 27   | 1.1080          |
| 1.3674        | 27.12 | 28   | 1.1099          |
| 1.3674        | 28.12 | 29   | 1.0930          |
| 1.145         | 29.12 | 30   | 1.0333          |
| 1.145         | 30.12 | 31   | 1.0096          |
| 1.145         | 31.12 | 32   | 1.0012          |
| 1.145         | 32.12 | 33   | 0.9266          |
| 1.145         | 33.12 | 34   | 0.9624          |
| 0.9987        | 34.12 | 35   | 0.9425          |
| 0.9987        | 35.12 | 36   | 0.9354          |
| 0.9987        | 36.12 | 37   | 0.9091          |
| 0.9987        | 37.12 | 38   | 0.9007          |
| 0.9987        | 38.12 | 39   | 0.9649          |
| 0.9071        | 39.12 | 40   | 0.9199          |
| 0.9071        | 40.12 | 41   | 0.8651          |
| 0.9071        | 41.12 | 42   | 0.8727          |
| 0.9071        | 42.12 | 43   | 0.8559          |
| 0.9071        | 43.12 | 44   | 0.8499          |
| 0.8522        | 44.12 | 45   | 0.8547          |
| 0.8522        | 45.12 | 46   | 0.8880          |
| 0.8522        | 46.12 | 47   | 0.8678          |
| 0.8522        | 47.12 | 48   | 0.8565          |
| 0.8522        | 48.12 | 49   | 0.8197          |
| 0.8153        | 49.12 | 50   | 0.8439          |


### Framework versions

- Transformers 4.35.0
- PyTorch 2.1.0a0+32f93b1
- Datasets 2.14.6
- Tokenizers 0.14.1