2024-05-15 13:34:20,264 - INFO: Calling run..
2024-05-15 13:34:20,264 - INFO: Problem Type: text_causal_language_modeling
2024-05-15 13:34:20,264 - INFO: Global random seed: 916310
2024-05-15 13:34:20,264 - INFO: Preparing the data...
2024-05-15 13:34:20,265 - INFO: Setting up automatic validation split...
2024-05-15 13:34:20,271 - INFO: Preparing train and validation data
2024-05-15 13:34:20,271 - INFO: Loading train dataset...
2024-05-15 13:34:21,196 - INFO: Stop token ids: [tensor([ 27, 91, 41681, 91, 29]), tensor([ 27, 91, 9399, 91, 29]), tensor([ 27, 91, 9125, 91, 29])]
2024-05-15 13:34:21,210 - INFO: Loading validation dataset...
2024-05-15 13:34:21,608 - INFO: Stop token ids: [tensor([ 27, 91, 41681, 91, 29]), tensor([ 27, 91, 9399, 91, 29]), tensor([ 27, 91, 9125, 91, 29])]
2024-05-15 13:34:21,625 - INFO: Number of observations in train dataset: 15
2024-05-15 13:34:21,625 - INFO: Number of observations in validation dataset: 1
2024-05-15 13:34:22,161 - INFO: Stop token ids: [tensor([ 27, 91, 41681, 91, 29], device='cuda:0'), tensor([ 27, 91, 9399, 91, 29], device='cuda:0'), tensor([ 27, 91, 9125, 91, 29], device='cuda:0')]
2024-05-15 13:34:22,173 - WARNING: PAD token id not matching between config and tokenizer. Overwriting with tokenizer id.
2024-05-15 13:34:22,173 - INFO: Setting pretraining_tp of model config to 1.
2024-05-15 13:34:22,188 - INFO: Using bfloat16 for backbone
2024-05-15 13:34:22,188 - INFO: Loading meta-llama/Meta-Llama-3-8B. This may take a while.
2024-05-15 13:35:50,131 - INFO: Loaded meta-llama/Meta-Llama-3-8B.
2024-05-15 13:35:50,134 - WARNING: PAD token id not matching between generation config and tokenizer. Overwriting with tokenizer id.
2024-05-15 13:35:50,135 - INFO: Lora module names: ['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj']
2024-05-15 13:35:50,379 - INFO: Enough space available for saving model weights.Required space: 15817.20MB, Available space: 983440.06MB.
2024-05-15 13:35:50,387 - INFO: Optimizer AdamW has been provided with parameters {'eps': 1e-08, 'weight_decay': 0.0, 'betas': (0.8999999762, 0.9990000129), 'lr': 0.0001}
2024-05-15 13:35:51,788 - INFO: started process: 0, can_track: True, tracking_mode: TrackingMode.AFTER_EPOCH
2024-05-15 13:35:51,788 - INFO: Training Epoch: 1 / 2
2024-05-15 13:35:51,789 - INFO: train loss: 0%| | 0/3 [00:00