OOzodbek committed on
Commit
81e19ac
1 Parent(s): 489ca7b

Upload logs

Files changed (1)
  1. logs.log +53 -0
logs.log ADDED
@@ -0,0 +1,53 @@
+ 2024-05-15 13:34:20,264 - INFO: Calling run..
+ 2024-05-15 13:34:20,264 - INFO: Problem Type: text_causal_language_modeling
+ 2024-05-15 13:34:20,264 - INFO: Global random seed: 916310
+ 2024-05-15 13:34:20,264 - INFO: Preparing the data...
+ 2024-05-15 13:34:20,265 - INFO: Setting up automatic validation split...
+ 2024-05-15 13:34:20,271 - INFO: Preparing train and validation data
+ 2024-05-15 13:34:20,271 - INFO: Loading train dataset...
+ 2024-05-15 13:34:21,196 - INFO: Stop token ids: [tensor([ 27, 91, 41681, 91, 29]), tensor([ 27, 91, 9399, 91, 29]), tensor([ 27, 91, 9125, 91, 29])]
+ 2024-05-15 13:34:21,210 - INFO: Loading validation dataset...
+ 2024-05-15 13:34:21,608 - INFO: Stop token ids: [tensor([ 27, 91, 41681, 91, 29]), tensor([ 27, 91, 9399, 91, 29]), tensor([ 27, 91, 9125, 91, 29])]
+ 2024-05-15 13:34:21,625 - INFO: Number of observations in train dataset: 15
+ 2024-05-15 13:34:21,625 - INFO: Number of observations in validation dataset: 1
+ 2024-05-15 13:34:22,161 - INFO: Stop token ids: [tensor([ 27, 91, 41681, 91, 29], device='cuda:0'), tensor([ 27, 91, 9399, 91, 29], device='cuda:0'), tensor([ 27, 91, 9125, 91, 29], device='cuda:0')]
+ 2024-05-15 13:34:22,173 - WARNING: PAD token id not matching between config and tokenizer. Overwriting with tokenizer id.
+ 2024-05-15 13:34:22,173 - INFO: Setting pretraining_tp of model config to 1.
+ 2024-05-15 13:34:22,188 - INFO: Using bfloat16 for backbone
+ 2024-05-15 13:34:22,188 - INFO: Loading meta-llama/Meta-Llama-3-8B. This may take a while.
+ 2024-05-15 13:35:50,131 - INFO: Loaded meta-llama/Meta-Llama-3-8B.
+ 2024-05-15 13:35:50,134 - WARNING: PAD token id not matching between generation config and tokenizer. Overwriting with tokenizer id.
+ 2024-05-15 13:35:50,135 - INFO: Lora module names: ['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj']
+ 2024-05-15 13:35:50,379 - INFO: Enough space available for saving model weights.Required space: 15817.20MB, Available space: 983440.06MB.
+ 2024-05-15 13:35:50,387 - INFO: Optimizer AdamW has been provided with parameters {'eps': 1e-08, 'weight_decay': 0.0, 'betas': (0.8999999762, 0.9990000129), 'lr': 0.0001}
+ 2024-05-15 13:35:51,788 - INFO: started process: 0, can_track: True, tracking_mode: TrackingMode.AFTER_EPOCH
+ 2024-05-15 13:35:51,788 - INFO: Training Epoch: 1 / 2
+ 2024-05-15 13:35:51,789 - INFO: train loss: 0%| | 0/3 [00:00<?, ?it/s]
+ 2024-05-15 13:35:51,922 - INFO: Evaluation step: 3
+ 2024-05-15 13:35:53,635 - INFO: train loss: 6.76: 33%|###3 | 1/3 [00:01<00:03, 1.85s/it]
+ 2024-05-15 13:35:54,389 - INFO: train loss: 6.14: 67%|######6 | 2/3 [00:02<00:01, 1.20s/it]
+ 2024-05-15 13:35:55,354 - INFO: train loss: 4.83: 100%|##########| 3/3 [00:03<00:00, 1.09s/it]
+ 2024-05-15 13:35:55,354 - INFO: train loss: 4.83: 100%|##########| 3/3 [00:03<00:00, 1.19s/it]
+ 2024-05-15 13:35:55,354 - INFO: Saving last model checkpoint to /app/output
+ 2024-05-15 13:35:55,354 - INFO: Saving checkpoint..
+ 2024-05-15 13:36:45,100 - INFO: Starting validation inference
+ 2024-05-15 13:36:45,100 - INFO: validation progress: 0%| | 0/1 [00:00<?, ?it/s]
+ 2024-05-15 13:36:45,330 - INFO: validation progress: 100%|##########| 1/1 [00:00<00:00, 4.35it/s]
+ 2024-05-15 13:36:45,334 - INFO: validation progress: 100%|##########| 1/1 [00:00<00:00, 4.29it/s]
+ 2024-05-15 13:36:45,371 - INFO: Validation Perplexity: 1.52548
+ 2024-05-15 13:36:45,371 - INFO: Mean validation loss: 0.42231
+ 2024-05-15 13:36:45,908 - INFO: Training Epoch: 2 / 2
+ 2024-05-15 13:36:45,908 - INFO: train loss: 0%| | 0/3 [00:00<?, ?it/s]
+ 2024-05-15 13:36:45,995 - INFO: Evaluation step: 3
+ 2024-05-15 13:36:46,947 - INFO: train loss: 0.44: 33%|###3 | 1/3 [00:01<00:02, 1.04s/it]
+ 2024-05-15 13:36:47,767 - INFO: train loss: 0.29: 67%|######6 | 2/3 [00:01<00:00, 1.10it/s]
+ 2024-05-15 13:36:49,228 - INFO: train loss: 0.23: 100%|##########| 3/3 [00:03<00:00, 1.16s/it]
+ 2024-05-15 13:36:49,228 - INFO: train loss: 0.23: 100%|##########| 3/3 [00:03<00:00, 1.11s/it]
+ 2024-05-15 13:36:49,228 - INFO: Saving last model checkpoint to /app/output
+ 2024-05-15 13:36:49,229 - INFO: Saving checkpoint..
+ 2024-05-15 13:37:33,992 - INFO: Starting validation inference
+ 2024-05-15 13:37:33,993 - INFO: validation progress: 0%| | 0/1 [00:00<?, ?it/s]
+ 2024-05-15 13:37:34,208 - INFO: validation progress: 100%|##########| 1/1 [00:00<00:00, 4.65it/s]
+ 2024-05-15 13:37:34,210 - INFO: validation progress: 100%|##########| 1/1 [00:00<00:00, 4.59it/s]
+ 2024-05-15 13:37:34,242 - INFO: Validation Perplexity: 1.07148
+ 2024-05-15 13:37:34,242 - INFO: Mean validation loss: 0.06904
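
Note on the validation metrics: the log reports a perplexity and a mean validation loss after each epoch. The sketch below is one way to pull those two values out of the log lines and check that they are consistent, assuming perplexity is the exponential of the mean loss (natural log). The log does not state that relationship, so treat it as an assumption. The names `LOG_LINES`, `METRIC_RE`, `extract_metrics`, and `is_consistent` are hypothetical and not part of any tool that produced this log.

```python
import math
import re

# Validation lines copied from the log (diff "+" markers removed).
LOG_LINES = [
    "2024-05-15 13:36:45,371 - INFO: Validation Perplexity: 1.52548",
    "2024-05-15 13:36:45,371 - INFO: Mean validation loss: 0.42231",
    "2024-05-15 13:37:34,242 - INFO: Validation Perplexity: 1.07148",
    "2024-05-15 13:37:34,242 - INFO: Mean validation loss: 0.06904",
]

# Hypothetical pattern matching the two metric lines seen in this log.
METRIC_RE = re.compile(
    r"INFO: (Validation Perplexity|Mean validation loss): ([0-9.]+)"
)


def extract_metrics(lines):
    """Return (perplexity, loss) pairs, one per validation pass, in log order."""
    perplexities, losses = [], []
    for line in lines:
        match = METRIC_RE.search(line)
        if match is None:
            continue  # not a metric line
        value = float(match.group(2))
        if match.group(1) == "Validation Perplexity":
            perplexities.append(value)
        else:
            losses.append(value)
    return list(zip(perplexities, losses))


def is_consistent(perplexity, loss, tol=1e-4):
    """Assumption: perplexity == exp(mean loss), up to rounding in the log."""
    return abs(math.exp(loss) - perplexity) < tol


if __name__ == "__main__":
    for epoch, (ppl, loss) in enumerate(extract_metrics(LOG_LINES), start=1):
        print(f"epoch {epoch}: ppl={ppl} loss={loss} "
              f"consistent={is_consistent(ppl, loss)}")
```

With only 1 validation observation (and 15 training observations), these figures describe a single sample, so the drop between epochs should be read with that in mind.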