johnhandleyd committed on
Commit 575c2f3
1 Parent(s): a56d003

Upload folder using huggingface_hub

README.md CHANGED
@@ -2,35 +2,30 @@
  license: mit
  base_model: TheBloke/zephyr-7B-alpha-GPTQ
  tags:
- - trl
- - sft
  - generated_from_trainer
- - peft
- - gptq
  model-index:
- - name: thesa
+ - name: thesa_v1
  results: []
- language:
- - en
- datasets:
- - loaiabdalslam/counselchat
- pipeline_tag: text-generation
- widget:
- - text: "<|system|>You are a therapist helping patients.<|user|>I'm fighting with my boyfriend and he's not talking to me. I don't know what to do<|assistant|>"
-   example_title: "Example 1"
  ---

- # Thesa
-
- Thesa is an experimental project of a therapy chatbot trained on mental health data and fine-tuned with the Zephyr GPTQ model that uses quantization to decrease high computatinal and storage costs.
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # thesa_v1
+
+ This model is a fine-tuned version of [TheBloke/zephyr-7B-alpha-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-alpha-GPTQ) on an unknown dataset.

  ## Model description

- - Fine-tuned from [TheBloke/zephyr-7B-alpha-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-alpha-GPTQ)
+ More information needed

  ## Intended uses & limitations

- The intended use is experimental.
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed

  ## Training procedure

@@ -43,16 +38,13 @@ The following hyperparameters were used during training:
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
- - training_steps: 250
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 10
  - mixed_precision_training: Native AMP

-
  ### Framework versions

  - Transformers 4.35.2
  - Pytorch 2.1.0+cu121
  - Datasets 2.16.1
  - Tokenizers 0.15.1
-
- ## More info
- More info at https://github.com/johnhandleyd/thesa
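The widget text removed from the README above encodes Zephyr's chat markers inline. A minimal sketch of assembling such a prompt string, assuming the `<|system|>` / `<|user|>` / `<|assistant|>` markers shown in the deleted widget (the real tokenizer's `apply_chat_template()` also inserts newlines and EOS tokens between turns, which this sketch omits):

```python
def build_zephyr_prompt(system: str, user: str) -> str:
    """Assemble a Zephyr-alpha style prompt string (hypothetical helper
    mirroring the inline widget format from the old model card)."""
    return f"<|system|>{system}<|user|>{user}<|assistant|>"

prompt = build_zephyr_prompt(
    "You are a therapist helping patients.",
    "I'm fighting with my boyfriend and he's not talking to me. I don't know what to do",
)
```

The trailing `<|assistant|>` marker cues the model to generate the assistant turn.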
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:53b56432ae401224e5ff9ca98ac000d12536a457865b1e1445e58d48278ba023
- size 27280152
+ oid sha256:870a387ece9708a79136efdea27fc4c3c8727e7941ac6356bd8c10040c274f00
+ size 133
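These binary-file diffs show Git LFS pointer files rather than the binaries themselves: each pointer records a spec version, the object's SHA-256 `oid`, and its byte `size`. A sketch of parsing that three-field format (hypothetical helper; the pointer text is copied from the new side of this diff):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields.
    Each line is 'key value' separated by a single space."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:870a387ece9708a79136efdea27fc4c3c8727e7941ac6356bd8c10040c274f00
size 133
"""
info = parse_lfs_pointer(pointer)
```

Note the recorded `size` dropped from 27280152 bytes to 133 bytes, i.e. the newly uploaded object is itself only 133 bytes; that is pointer-sized, which may indicate the adapter weights were not fully uploaded in this commit.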
runs/Feb26_23-15-54_7d7bf0a59e1f/events.out.tfevents.1708989426.7d7bf0a59e1f.1627.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:eb8e7b8d1fb0d7d44d65874cb4ce7718031a4b244603811b6ae4550772c1c410
+ size 130
special_tokens_map.json CHANGED
@@ -18,13 +18,7 @@
    "rstrip": false,
    "single_word": false
  },
- "pad_token": {
-   "content": "</s>",
-   "lstrip": false,
-   "normalized": false,
-   "rstrip": false,
-   "single_word": false
- },
+ "pad_token": "</s>",
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
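The change above collapses the `pad_token` entry from a full added-token dict to a bare string; both shapes are accepted in `special_tokens_map.json` and name the same token. A sketch (hypothetical helper) showing that either form yields the same pad token content:

```python
import json

def pad_token_content(special_tokens_map: dict) -> str:
    """Return the pad token string whether it is stored as a bare
    string or as a full added-token dict (hypothetical helper)."""
    pad = special_tokens_map.get("pad_token")
    if isinstance(pad, dict):
        return pad["content"]
    return pad

old = json.loads(
    '{"pad_token": {"content": "</s>", "lstrip": false, "normalized": false, '
    '"rstrip": false, "single_word": false}}'
)
new = json.loads('{"pad_token": "</s>"}')
```

The dict form additionally carries normalization flags; the bare-string form falls back to the tokenizer's defaults for those.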
tokenizer.json CHANGED
@@ -2,7 +2,7 @@
    "version": "1.0",
    "truncation": {
      "direction": "Left",
-     "max_length": 1024,
+     "max_length": 512,
      "strategy": "LongestFirst",
      "stride": 0
    },
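This hunk halves the truncation window from 1024 to 512 tokens while keeping `"direction": "Left"`, so overlong inputs are cut from the front and the most recent tokens survive. A minimal sketch of that behavior on a flat token-id list (the real `tokenizers` library also handles pairs and the `LongestFirst` strategy, which this ignores):

```python
def truncate_left(ids: list, max_length: int = 512) -> list:
    """Mimic left-direction truncation: when the sequence exceeds
    max_length, drop tokens from the start and keep the tail."""
    return ids[-max_length:] if len(ids) > max_length else ids
```

Left truncation suits chat inputs, where the latest turns matter more than the oldest context.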
tokenizer.model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:dadfd56d766715c61d2ef780a525ab43b8e6da4de6865bda3d95fdef5e134055
- size 493443
+ oid sha256:d3daefa6fd9ee26430a71ad6009f05c4c4ec086746b2dcc3d04649f631d3654f
+ size 131
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0940395a2687650e17943742dd9d4f2da52e06e88efc1b5f2092954bad39de8c
- size 4600
+ oid sha256:cc72676d9cc9796870f3d2f933bc9ecbe7e0ed5651fb64662437296f3d056120
+ size 129
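`training_args.bin` serializes the run's `TrainingArguments`. A hypothetical reconstruction, as a plain dict, of the hyperparameters the README hunk earlier in this commit lists (key names follow the usual `transformers` argument names; this is a sketch, not the contents of the binary):

```python
# Hypothetical reconstruction of the hyperparameters listed in the
# README diff; training_args.bin stores the serialized TrainingArguments.
training_args = {
    "seed": 42,
    "adam_beta1": 0.9,            # "Adam with betas=(0.9,0.999)"
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,        # "epsilon=1e-08"
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,          # replaces the removed training_steps: 250
    "num_train_epochs": 10,
    "fp16": True,                 # "Native AMP" mixed precision
}
```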