Cyrus1020 committed
Commit 605d8a7
1 Parent(s): add42cd

Update README.md

Files changed (1):
1. README.md: +1 −96
README.md CHANGED
@@ -30,7 +30,7 @@ This model is based on a Llama2 model that was fine-tuned
  To run the model, the demo code is provided in demo.ipynb submitted.
  It is advised to use the pre-processing and post-processing functions (provided in demo.ipynb) along with the model for best results.
 
- - **Developed by:** Hei Chan and Mehedi Bari
+ - **Developed by:** Hei Chan
  - **Language(s):** English
  - **Model type:** Supervised
  - **Model architecture:** Transformers
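
The context lines above point readers to demo.ipynb for running the model. A minimal loading sketch, assuming the published weights can be loaded with `AutoModelForCausalLM` in 4-bit via bitsandbytes (consistent with the ~6 GB VRAM note in the removed Hardware section); the repository id is a placeholder, and the real prompt construction and answer parsing belong to the pre-/post-processing functions in demo.ipynb:

```python
# Hypothetical loading sketch -- the authoritative pipeline is in demo.ipynb.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "path/to/this-repo"  # placeholder: substitute this model's Hub id or a local path

# 4-bit quantization is an assumption, chosen to fit the ~6 GB VRAM figure in the card.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)
model.eval()

# Prompt construction and answer parsing should follow the pre-/post-processing
# functions shipped in demo.ipynb; this generate call is only illustrative.
prompt = "..."  # placeholder: built by the demo.ipynb pre-processing function
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```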
@@ -43,99 +43,4 @@ This model is based on a Llama2 model that was fine-tuned
  - **Repository:** https://huggingface.co/meta-llama/Llama-2-7b-hf
  - **Paper or documentation:** https://arxiv.org/abs/2307.09288
 
- ## Training Details
-
- ### Training Data
-
- <!-- This is a short stub of information on the training data that was used, and documentation related to data pre-processing or additional filtering (if applicable). -->
-
- 30K pairs of texts drawn from emails, news articles and blog posts.
-
- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Training Hyperparameters
-
- <!-- This is a summary of the values of hyperparameters used in training the model. -->
-
-
- - learning_rate: 1e-05
- - weight decay: 0.001
- - train_batch_size: 2
- - gradient accumulation steps: 4
- - optimizer: paged_adamw_8bit
- - LoRA r: 64
- - LoRA alpha: 128
- - LoRA dropout: 0.05
- - RSLoRA: True
- - max grad norm: 0.3
- - eval_batch_size: 1
- - num_epochs: 1
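
The hyperparameters removed above describe a QLoRA-style LoRA fine-tune. As an illustration only, they could map onto `peft` and `transformers` objects roughly as below; the training script itself is not published, and the target modules are an assumption not stated in the card:

```python
# Illustrative mapping of the listed hyperparameters onto peft/transformers objects.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=64,                    # LoRA r
    lora_alpha=128,          # LoRA alpha
    lora_dropout=0.05,       # LoRA dropout
    use_rslora=True,         # RSLoRA: True
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],  # assumption: not specified in the card
)

training_args = TrainingArguments(
    output_dir="out",
    learning_rate=1e-5,
    weight_decay=0.001,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,
    optim="paged_adamw_8bit",
    max_grad_norm=0.3,
    num_train_epochs=1,
)
```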
- #### Speeds, Sizes, Times
-
- <!-- This section provides information about how roughly how long it takes to train the model and the size of the resulting model. -->
-
-
- - trained on: V100 16GB
- - overall training time: 59 hours
- - duration per training epoch: 59 hours
- - model size: ~27GB
- - LoRA adaptor size: 192 MB
-
- ## Evaluation
-
- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ### Testing Data & Metrics
-
- #### Testing Data
-
- <!-- This should describe any evaluation data used (e.g., the development/validation set provided). -->
-
- The development set provided, amounting to 6K pairs.
-
- #### Metrics
-
- <!-- These are the evaluation metrics being used. -->
-
-
- - Precision
- - Recall
- - F1-score
- - Accuracy
-
- ### Results
-
-
- - Precision: 80.6%
- - Recall: 80.4%
- - F1 score: 80.3%
- - Accuracy: 80.4%
-
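
The removed Evaluation block reports precision, recall, F1 and accuracy on the 6K-pair development set, but not the scoring code. A minimal sketch of how such metrics are typically computed, assuming label predictions and scikit-learn (neither of which the card specifies):

```python
# Hypothetical scoring sketch; the card lists the metrics but not the scoring script.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def score(y_true, y_pred):
    """Return the four metrics reported in the Results section."""
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro"  # assumption: averaging scheme is not stated
    )
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy_score(y_true, y_pred),
    }

# Example with dummy labels:
print(score([1, 0, 1, 1], [1, 0, 0, 1]))
```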
- ## Technical Specifications
-
- ### Hardware
-
-
- - Mode: Inference
- - VRAM: at least 6 GB
- - Storage: at least 30 GB
- - GPU: RTX3060
-
- ### Software
-
-
- - Transformers
- - Pytorch
- - bitesandbytes
- - Accelerate
-
- ## Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- Any inputs (concatenation of two sequences plus prompt words) longer than
- 4096 subwords will be truncated by the model.
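
The limitation removed above concerns the 4096-subword context window of the Llama-2 base. A small, assumed length guard (demo.ipynb may already handle this differently):

```python
# Hypothetical length check for the 4096-subword limit noted in the removed section.
MAX_LEN = 4096

def check_prompt_length(tokenizer, prompt: str) -> int:
    """Warn when a concatenated prompt would be truncated by the model."""
    n_tokens = len(tokenizer(prompt)["input_ids"])
    if n_tokens > MAX_LEN:
        print(f"Prompt is {n_tokens} tokens; anything past {MAX_LEN} will be truncated.")
    return n_tokens
```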
 
 