[hops] 2024-09-24 17:10:50.083 | INFO | Initializing a parser from /workspace/configs/exp_camembertv2/camembertv2_base_p2_17k_last_layer.yaml
[hops] 2024-09-24 17:10:50.136 | INFO | Generating a FastText model from the treebank
[hops] 2024-09-24 17:10:50.154 | INFO | Training fasttext model
[hops] 2024-09-24 17:10:51.461 | WARNING | Some weights of RobertaModel were not initialized from the model checkpoint at /scratch/camembertv2/runs/models/camembertv2-base-bf16/post/ckpt-p2-17000/pt/ and are newly initialized: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[hops] 2024-09-24 17:10:57.872 | INFO | Start training on cuda:3
[hops] 2024-09-24 17:10:57.876 | WARNING | You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[hops] 2024-09-24 17:11:12.533 | INFO | Epoch 0: train loss 2.8127 dev loss 2.0313 dev tag acc 38.38% dev head acc 26.47% dev deprel acc 52.06%
[hops] 2024-09-24 17:11:12.534 | INFO | New best model: head accuracy 26.47% > 0.00%
[hops] 2024-09-24 17:11:30.203 | INFO | Epoch 1: train loss 1.6494 dev loss 1.1043 dev tag acc 67.79% dev head acc 58.24% dev deprel acc 77.43%
[hops] 2024-09-24 17:11:30.203 | INFO | New best model: head accuracy 58.24% > 26.47%
[hops] 2024-09-24 17:11:47.950 | INFO | Epoch 2: train loss 1.0207 dev loss 0.6941 dev tag acc 79.40% dev head acc 77.31% dev deprel acc 84.67%
[hops] 2024-09-24 17:11:47.951 | INFO | New best model: head accuracy 77.31% > 58.24%
[hops] 2024-09-24 17:12:04.965 | INFO | Epoch 3: train loss 0.7013 dev loss 0.4977 dev tag acc 87.37% dev head acc 83.79% dev deprel acc 89.06%
[hops] 2024-09-24 17:12:04.966 | INFO | New best model: head accuracy 83.79% > 77.31%
[hops] 2024-09-24 17:12:21.325 | INFO | Epoch 4: train loss 0.5227 dev loss 0.3940 dev tag acc 90.34% dev head acc 85.27% dev deprel acc 91.90%
[hops] 2024-09-24 17:12:21.326 | INFO | New best model: head accuracy 85.27% > 83.79%
[hops] 2024-09-24 17:12:38.042 | INFO | Epoch 5: train loss 0.4107 dev loss 0.3263 dev tag acc 93.43% dev head acc 87.56% dev deprel acc 93.04%
[hops] 2024-09-24 17:12:38.043 | INFO | New best model: head accuracy 87.56% > 85.27%
[hops] 2024-09-24 17:12:55.903 | INFO | Epoch 6: train loss 0.3357 dev loss 0.2794 dev tag acc 95.46% dev head acc 88.97% dev deprel acc 93.72%
[hops] 2024-09-24 17:12:55.904 | INFO | New best model: head accuracy 88.97% > 87.56%
[hops] 2024-09-24 17:13:13.763 | INFO | Epoch 7: train loss 0.2768 dev loss 0.2410 dev tag acc 96.72% dev head acc 90.81% dev deprel acc 94.62%
[hops] 2024-09-24 17:13:13.764 | INFO | New best model: head accuracy 90.81% > 88.97%
[hops] 2024-09-24 17:13:31.249 | INFO | Epoch 8: train loss 0.2339 dev loss 0.2229 dev tag acc 97.78% dev head acc 91.26% dev deprel acc 95.34%
[hops] 2024-09-24 17:13:31.250 | INFO | New best model: head accuracy 91.26% > 90.81%
[hops] 2024-09-24 17:13:48.426 | INFO | Epoch 9: train loss 0.1999 dev loss 0.2153 dev tag acc 98.06% dev head acc 91.40% dev deprel acc 95.68%
[hops] 2024-09-24 17:13:48.427 | INFO | New best model: head accuracy 91.40% > 91.26%
[hops] 2024-09-24 17:14:05.318 | INFO | Epoch 10: train loss 0.1755 dev loss 0.2009 dev tag acc 98.31% dev head acc 92.71% dev deprel acc 96.05%
[hops] 2024-09-24 17:14:05.320 | INFO | New best model: head accuracy 92.71% > 91.40%
[hops] 2024-09-24 17:14:22.152 | INFO | Epoch 11: train loss 0.1523 dev loss 0.1893 dev tag acc 98.35% dev head acc 93.49% dev deprel acc 96.03%
[hops] 2024-09-24 17:14:22.153 | INFO | New best model: head accuracy 93.49% > 92.71%
[hops] 2024-09-24 17:14:39.729 | INFO | Epoch 12: train loss 0.1368 dev loss 0.1846 dev tag acc 98.52% dev head acc 93.65% dev deprel acc 96.35%
[hops] 2024-09-24 17:14:39.730 | INFO | New best model: head accuracy 93.65% > 93.49%
[hops] 2024-09-24 17:14:57.785 | INFO | Epoch 13: train loss 0.1220 dev loss 0.2027 dev tag acc 98.63% dev head acc 93.91% dev deprel acc 96.43%
[hops] 2024-09-24 17:14:57.786 | INFO | New best model: head accuracy 93.91% > 93.65%
[hops] 2024-09-24 17:15:15.740 | INFO | Epoch 14: train loss 0.1122 dev loss 0.1918 dev tag acc 98.79% dev head acc 94.04% dev deprel acc 96.89%
[hops] 2024-09-24 17:15:15.741 | INFO | New best model: head accuracy 94.04% > 93.91%
[hops] 2024-09-24 17:15:33.756 | INFO | Epoch 15: train loss 0.1010 dev loss 0.1851 dev tag acc 98.90% dev head acc 93.87% dev deprel acc 97.01%
[hops] 2024-09-24 17:15:49.185 | INFO | Epoch 16: train loss 0.0943 dev loss 0.1964 dev tag acc 98.97% dev head acc 94.26% dev deprel acc 97.05%
[hops] 2024-09-24 17:15:49.186 | INFO | New best model: head accuracy 94.26% > 94.04%
[hops] 2024-09-24 17:16:05.715 | INFO | Epoch 17: train loss 0.0822 dev loss 0.1925 dev tag acc 98.84% dev head acc 95.16% dev deprel acc 97.18%
[hops] 2024-09-24 17:16:05.716 | INFO | New best model: head accuracy 95.16% > 94.26%
[hops] 2024-09-24 17:16:23.327 | INFO | Epoch 18: train loss 0.0783 dev loss 0.1929 dev tag acc 99.02% dev head acc 94.98% dev deprel acc 97.26%
[hops] 2024-09-24 17:16:38.640 | INFO | Epoch 19: train loss 0.0720 dev loss 0.1976 dev tag acc 99.09% dev head acc 95.05% dev deprel acc 97.11%
[hops] 2024-09-24 17:16:53.616 | INFO | Epoch 20: train loss 0.0644 dev loss 0.1988 dev tag acc 99.10% dev head acc 95.09% dev deprel acc 97.28%
[hops] 2024-09-24 17:17:09.018 | INFO | Epoch 21: train loss 0.0609 dev loss 0.2084 dev tag acc 99.14% dev head acc 95.39% dev deprel acc 97.28%
[hops] 2024-09-24 17:17:09.019 | INFO | New best model: head accuracy 95.39% > 95.16%
[hops] 2024-09-24 17:17:26.308 | INFO | Epoch 22: train loss 0.0585 dev loss 0.2076 dev tag acc 99.19% dev head acc 95.35% dev deprel acc 97.58%
[hops] 2024-09-24 17:17:41.545 | INFO | Epoch 23: train loss 0.0545 dev loss 0.2094 dev tag acc 99.15% dev head acc 95.29% dev deprel acc 97.49%
[hops] 2024-09-24 17:17:56.873 | INFO | Epoch 24: train loss 0.0502 dev loss 0.2116 dev tag acc 99.17% dev head acc 95.23% dev deprel acc 97.49%
[hops] 2024-09-24 17:18:12.142 | INFO | Epoch 25: train loss 0.0455 dev loss 0.2059 dev tag acc 99.23% dev head acc 95.44% dev deprel acc 97.55%
[hops] 2024-09-24 17:18:12.143 | INFO | New best model: head accuracy 95.44% > 95.39%
[hops] 2024-09-24 17:18:29.698 | INFO | Epoch 26: train loss 0.0436 dev loss 0.2258 dev tag acc 99.22% dev head acc 95.46% dev deprel acc 97.44%
[hops] 2024-09-24 17:18:29.699 | INFO | New best model: head accuracy 95.46% > 95.44%
[hops] 2024-09-24 17:18:45.933 | INFO | Epoch 27: train loss 0.0404 dev loss 0.2359 dev tag acc 99.20% dev head acc 95.56% dev deprel acc 97.47%
[hops] 2024-09-24 17:18:45.934 | INFO | New best model: head accuracy 95.56% > 95.46%
[hops] 2024-09-24 17:19:02.822 | INFO | Epoch 28: train loss 0.0376 dev loss 0.2342 dev tag acc 99.21% dev head acc 95.75% dev deprel acc 97.56%
[hops] 2024-09-24 17:19:02.823 | INFO | New best model: head accuracy 95.75% > 95.56%
[hops] 2024-09-24 17:19:20.495 | INFO | Epoch 29: train loss 0.0365 dev loss 0.2271 dev tag acc 99.21% dev head acc 95.69% dev deprel acc 97.61%
[hops] 2024-09-24 17:19:34.954 | INFO | Epoch 30: train loss 0.0349 dev loss 0.2359 dev tag acc 99.21% dev head acc 95.79% dev deprel acc 97.60%
[hops] 2024-09-24 17:19:34.955 | INFO | New best model: head accuracy 95.79% > 95.75%
[hops] 2024-09-24 17:19:52.083 | INFO | Epoch 31: train loss 0.0333 dev loss 0.2284 dev tag acc 99.22% dev head acc 95.68% dev deprel acc 97.60%
[hops] 2024-09-24 17:20:07.114 | INFO | Epoch 32: train loss 0.0302 dev loss 0.2329 dev tag acc 99.23% dev head acc 95.50% dev deprel acc 97.64%
[hops] 2024-09-24 17:20:20.909 | INFO | Epoch 33: train loss 0.0280 dev loss 0.2253 dev tag acc 99.27% dev head acc 95.56% dev deprel acc 97.70%
[hops] 2024-09-24 17:20:36.250 | INFO | Epoch 34: train loss 0.0269 dev loss 0.2490 dev tag acc 99.20% dev head acc 95.74% dev deprel acc 97.69%
[hops] 2024-09-24 17:20:51.100 | INFO | Epoch 35: train loss 0.0266 dev loss 0.2576 dev tag acc 99.21% dev head acc 95.74% dev deprel acc 97.78%
[hops] 2024-09-24 17:21:04.672 | INFO | Epoch 36: train loss 0.0255 dev loss 0.2525 dev tag acc 99.31% dev head acc 95.81% dev deprel acc 97.75%
[hops] 2024-09-24 17:21:04.673 | INFO | New best model: head accuracy 95.81% > 95.79%
[hops] 2024-09-24 17:21:20.986 | INFO | Epoch 37: train loss 0.0226 dev loss 0.2545 dev tag acc 99.29% dev head acc 95.85% dev deprel acc 97.67%
[hops] 2024-09-24 17:21:20.987 | INFO | New best model: head accuracy 95.85% > 95.81%
[hops] 2024-09-24 17:21:38.097 | INFO | Epoch 38: train loss 0.0224 dev loss 0.2743 dev tag acc 99.26% dev head acc 95.97% dev deprel acc 97.61%
[hops] 2024-09-24 17:21:38.098 | INFO | New best model: head accuracy 95.97% > 95.85%
[hops] 2024-09-24 17:21:54.248 | INFO | Epoch 39: train loss 0.0213 dev loss 0.2854 dev tag acc 99.26% dev head acc 95.75% dev deprel acc 97.66%
[hops] 2024-09-24 17:22:09.077 | INFO | Epoch 40: train loss 0.0212 dev loss 0.2520 dev tag acc 99.26% dev head acc 95.94% dev deprel acc 97.63%
[hops] 2024-09-24 17:22:24.533 | INFO | Epoch 41: train loss 0.0198 dev loss 0.2570 dev tag acc 99.31% dev head acc 96.04% dev deprel acc 97.81%
[hops] 2024-09-24 17:22:24.534 | INFO | New best model: head accuracy 96.04% > 95.97%
[hops] 2024-09-24 17:22:41.309 | INFO | Epoch 42: train loss 0.0179 dev loss 0.2711 dev tag acc 99.30% dev head acc 95.95% dev deprel acc 97.74%
[hops] 2024-09-24 17:22:56.619 | INFO | Epoch 43: train loss 0.0166 dev loss 0.2740 dev tag acc 99.27% dev head acc 96.03% dev deprel acc 97.86%
[hops] 2024-09-24 17:23:11.247 | INFO | Epoch 44: train loss 0.0168 dev loss 0.2802 dev tag acc 99.27% dev head acc 96.07% dev deprel acc 97.83%
[hops] 2024-09-24 17:23:11.248 | INFO | New best model: head accuracy 96.07% > 96.04%
[hops] 2024-09-24 17:23:29.041 | INFO | Epoch 45: train loss 0.0163 dev loss 0.2719 dev tag acc 99.28% dev head acc 96.19% dev deprel acc 97.87%
[hops] 2024-09-24 17:23:29.042 | INFO | New best model: head accuracy 96.19% > 96.07%
[hops] 2024-09-24 17:23:46.148 | INFO | Epoch 46: train loss 0.0180 dev loss 0.2666 dev tag acc 99.26% dev head acc 96.01% dev deprel acc 97.86%
[hops] 2024-09-24 17:24:01.336 | INFO | Epoch 47: train loss 0.0142 dev loss 0.2792 dev tag acc 99.29% dev head acc 96.07% dev deprel acc 97.83%
[hops] 2024-09-24 17:24:16.066 | INFO | Epoch 48: train loss 0.0134 dev loss 0.2820 dev tag acc 99.27% dev head acc 96.06% dev deprel acc 97.79%
[hops] 2024-09-24 17:24:31.201 | INFO | Epoch 49: train loss 0.0137 dev loss 0.2877 dev tag acc 99.32% dev head acc 96.13% dev deprel acc 97.85%
[hops] 2024-09-24 17:24:46.077 | INFO | Epoch 50: train loss 0.0130 dev loss 0.2910 dev tag acc 99.28% dev head acc 96.11% dev deprel acc 97.91%
[hops] 2024-09-24 17:25:01.474 | INFO | Epoch 51: train loss 0.0120 dev loss 0.3076 dev tag acc 99.27% dev head acc 96.06% dev deprel acc 97.86%
[hops] 2024-09-24 17:25:15.876 | INFO | Epoch 52: train loss 0.0114 dev loss 0.3043 dev tag acc 99.28% dev head acc 96.13% dev deprel acc 97.86%
[hops] 2024-09-24 17:25:31.219 | INFO | Epoch 53: train loss 0.0113 dev loss 0.3071 dev tag acc 99.26% dev head acc 96.07% dev deprel acc 97.89%
[hops] 2024-09-24 17:25:46.377 | INFO | Epoch 54: train loss 0.0103 dev loss 0.3065 dev tag acc 99.27% dev head acc 96.25% dev deprel acc 97.94%
[hops] 2024-09-24 17:25:46.378 | INFO | New best model: head accuracy 96.25% > 96.19%
[hops] 2024-09-24 17:26:03.659 | INFO | Epoch 55: train loss 0.0104 dev loss 0.3091 dev tag acc 99.27% dev head acc 96.19% dev deprel acc 97.88%
[hops] 2024-09-24 17:26:18.791 | INFO | Epoch 56: train loss 0.0098 dev loss 0.3122 dev tag acc 99.27% dev head acc 96.05% dev deprel acc 97.84%
[hops] 2024-09-24 17:26:34.137 | INFO | Epoch 57: train loss 0.0094 dev loss 0.3159 dev tag acc 99.26% dev head acc 96.07% dev deprel acc 97.82%
[hops] 2024-09-24 17:26:49.249 | INFO | Epoch 58: train loss 0.0094 dev loss 0.3203 dev tag acc 99.28% dev head acc 96.15% dev deprel acc 97.87%
[hops] 2024-09-24 17:27:04.725 | INFO | Epoch 59: train loss 0.0082 dev loss 0.3228 dev tag acc 99.28% dev head acc 96.17% dev deprel acc 97.92%
[hops] 2024-09-24 17:27:20.299 | INFO | Epoch 60: train loss 0.0089 dev loss 0.3213 dev tag acc 99.29% dev head acc 96.13% dev deprel acc 97.92%
[hops] 2024-09-24 17:27:35.759 | INFO | Epoch 61: train loss 0.0082 dev loss 0.3217 dev tag acc 99.29% dev head acc 96.19% dev deprel acc 97.93%
[hops] 2024-09-24 17:27:51.235 | INFO | Epoch 62: train loss 0.0082 dev loss 0.3231 dev tag acc 99.29% dev head acc 96.24% dev deprel acc 97.91%
[hops] 2024-09-24 17:28:06.104 | INFO | Epoch 63: train loss 0.0085 dev loss 0.3223 dev tag acc 99.29% dev head acc 96.24% dev deprel acc 97.90%
[hops] 2024-09-24 17:28:11.404 | WARNING | You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[hops] 2024-09-24 17:28:16.852 | WARNING | You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[hops] 2024-09-24 17:28:18.856 | INFO | Metrics for Sequoia-camembertv2_base_p2_17k_last_layer+rand_seed=42
───────────────────────────────
Split   UPOS    UAS     LAS
───────────────────────────────
Dev     99.27   96.30   95.11
Test    99.38   96.12   94.94
───────────────────────────────