The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
0it [00:00, ?it/s]
0it [00:00, ?it/s]
/opt/conda/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
  warnings.warn(
2024-07-20 09:26:20.762058: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-07-20 09:26:20.762163: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-07-20 09:26:20.880295: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Generating train split: 92867 examples [00:05, 16981.50 examples/s]
Running tokenizer on train dataset: 0%| | 0/92867 [00:00