---
license: other
base_model: microsoft/phi-1_5
tags:
- generated_from_trainer
model-index:
- name: titletor-phi_1-5
  results: []
---

# titletor-phi_1-5

This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 2.1587 (perplexity ≈ exp(2.1587) ≈ 8.66)

## Model description

More information needed

## Intended uses & limitations

More information needed
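
The task this checkpoint was tuned for is not documented, but it loads like any Hugging Face causal LM. A minimal inference sketch, assuming the repo id below resolves to this checkpoint; `trust_remote_code=True` is needed because phi-1_5 shipped custom modeling code before native support landed in Transformers 4.37:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "titletor-phi_1-5"  # assumption: substitute the actual Hub path

# phi-1_5 relies on remote code on the Transformers 4.35.x used here
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Write a title for the following text:\n"  # prompt format is undocumented
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```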

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 200
- num_epochs: 3
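
These values map one-to-one onto `transformers.TrainingArguments`. A minimal sketch of the launch configuration under that assumption (`output_dir` is a placeholder, the 40-step eval cadence is inferred from the results table below, and the dataset, model, and collator setup are omitted because they are not documented):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="titletor-phi_1-5",   # assumption
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size: 4 * 2 = 8
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=200,
    num_train_epochs=3,
    evaluation_strategy="steps",
    eval_steps=40,                   # inferred from the 40-step evals in the table
)
```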

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4023        | 0.1   | 40   | 2.9074          |
| 2.7878        | 0.2   | 80   | 2.7608          |
| 2.7083        | 0.3   | 120  | 2.6496          |
| 2.6213        | 0.41  | 160  | 2.5309          |
| 2.5145        | 0.51  | 200  | 2.4658          |
| 2.4395        | 0.61  | 240  | 2.4294          |
| 2.4016        | 0.71  | 280  | 2.3857          |
| 2.4194        | 0.81  | 320  | 2.3635          |
| 2.3467        | 0.91  | 360  | 2.3278          |
| 2.2736        | 1.02  | 400  | 2.2854          |
| 2.1737        | 1.12  | 440  | 2.2824          |
| 2.1805        | 1.22  | 480  | 2.2722          |
| 2.1472        | 1.32  | 520  | 2.2521          |
| 2.1654        | 1.42  | 560  | 2.2372          |
| 2.1281        | 1.52  | 600  | 2.2304          |
| 2.0958        | 1.62  | 640  | 2.2136          |
| 2.1422        | 1.73  | 680  | 2.1955          |
| 2.07          | 1.83  | 720  | 2.1919          |
| 2.0684        | 1.93  | 760  | 2.1829          |
| 2.0392        | 2.03  | 800  | 2.1726          |
| 1.868         | 2.13  | 840  | 2.1760          |
| 1.8342        | 2.23  | 880  | 2.1696          |
| 1.8225        | 2.34  | 920  | 2.1684          |
| 1.8678        | 2.44  | 960  | 2.1671          |
| 1.8543        | 2.54  | 1000 | 2.1618          |
| 1.8666        | 2.64  | 1040 | 2.1607          |
| 1.8597        | 2.74  | 1080 | 2.1600          |
| 1.8605        | 2.84  | 1120 | 2.1591          |
| 1.8515        | 2.94  | 1160 | 2.1587          |


### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0